Preface


This edition ushers in a new era for our book in three important ways. First, after 21 
years of publication by Holden-Day, the book was purchased by McGraw-Hill in 
1988. Thus, this is the first edition produced by our new publisher. We look forward 
to a long and fruitful collaboration with McGraw-Hill. 

Second, the book now is also available in two more compact split volumes,
Introduction to Mathematical Programming and Introduction to Stochastic Models in 
Operations Research, for use in more specialized courses. All of the material in this 
fifth edition has been duplicated in one or the other of the split volumes (and occa- 
sionally in both). The split volumes also include material from previous editions that 
was subsequently deleted for space reasons, so these sections are somewhat more 
comprehensive. 

Third, this edition also inaugurates the incorporation of software that has been 
specifically designed to be a teaching supplement to this book. This software is pack- 
aged in the back of the book for use on either an IBM (or IBM-compatible) personal 
computer with a graphics card or a Macintosh. As described in Sec. 1.6, this is very 
different from the usual software now widely available for running the algorithms of 
operations research on microcomputers. Instead, it is true tutorial software for helping 
to learn the material in this book (or the two split volumes). The demonstration 
examples, the routines for interactive execution of algorithms, and the routines for 
post-optimality analysis should make the learning process far more efficient and ef- 
fective as well as more stimulating and enjoyable. We are convinced that this type of 
innovative software will usher in a new era for operations research education. 

The major reorganization of the book for the fourth edition has been received 
so well that we have kept it intact for this edition. Within this organization, several
new sections have been added, several rewritten sections have been renamed, and the
title of Chap. 15 has been changed from Stochastic Processes to Markov Chains. 
Except for updating, there have been no major deletions of material. 

Although there are no new chapters, we have added considerable new material 
within existing chapters to reflect important recent developments. The major new 
topics are (1) a nontechnical introduction and evaluation of the new interior-point 
approach to solving linear programming problems proposed by N. Karmarkar (see the 
beginning of Sec. 4.9), (2) a technical introduction to this interior-point approach that 
highlights the most important concepts in an elementary way (see Sec. 9.4), (3) the
emerging new role for microcomputers in implementing some of the algorithms of 
operations research, including linear programming (see the end of Sec. 4.9), (4) the 
minimum cost flow problem (capacitated transshipment problem) and its special cases 
(see Sec. 10.6), (5) the network simplex method for solving minimum cost flow
problems (see Sec. 10.7), (6) an introduction to a recent algorithmic breakthrough for
solving large binary integer programming problems by combining automatic problem 
preprocessing and the generation of cutting planes with clever branch-and-bound tech- 
niques (see Sec. 13.6), (7) Jackson queueing networks (see the end of Sec. 16.9), (8) 
a continuous review inventory model with fixed lag and no backlogging permitted 
(see the end of Sec. 18.4), and (9) forecasting in the presence of seasonal effects (see 
Sec. 19.6). 

In addition to these new topics, ten chapters received major revisions to increase 
the clarity of the presentation as well as to update material as needed. Five of these 
(Chaps. 3 to 7) are linear programming chapters. The most important revisions in 
these five chapters are (1) a major new ongoing example in nonstandard form (the 
radiation therapy example) to complement the Wyndor Glass Co. example (see Secs.
3.4, 4.6, and 6.4), (2) many new illustrations of basic concepts, (3) clarification of 
the assumptions of linear programming (see Sec. 3.3), (4) improved presentation of 
the simplex method (see Chaps. 4 and 5), (5) expanded coverage of the two-phase 
method for the simplex method (see Sec. 4.6), (6) clearer geometric interpretation of 
the simplex method, including in three dimensions (see Sec. 5.1), (7) clarification of 
the “fundamental insight” for the simplex method (see Sec. 5.3), (8) improved treatment
of sensitivity analysis, including the geometric interpretation, incremental analysis,
and parametric programming (see Secs. 6.6 and 6.7), (9) further explanation of
the transshipment problem (see Sec. 7.3), and (10) expanded coverage of the assignment
problem (see Sec. 7.4).

Chapter 10 (Network Analysis, Including PERT-CPM) has been mostly rewrit- 
ten to reflect the increasing importance of this material, update the terminology, 
incorporate the new sections on the minimum cost flow problem and the network 
simplex method, and emphasize the intimate relationship with linear programming. 
For those instructors who prefer the “modern” network interpretation of the transportation
problem, the transshipment problem, and the assignment problem, Secs.
10.6 and 10.7 emphasize this interpretation, whereas Chap. 7 retains the “classical”
approach preferred by many. 

Much of Chap. 13 (Integer Programming) has been rewritten to update and
improve the presentation of the branch-and-bound technique and its use for the algorithms
of integer programming. The availability of the new software for efficiently
doing homework problems by interactively applying branch-and-bound algorithms
based on linear programming relaxations now enables us to increase our focus on this
important approach. Therefore, the old assignment problem example and Balas’
additive algorithm have been discarded in favor of this approach.

Three of the chapters on probabilistic models have been thoroughly revised.
Chapter 15 (Markov Chains) has been rewritten, and it now includes more examples
and applications. Much of Chap. 19 (Forecasting) has been rewritten so as to present
an elementary, but rigorous, treatment of a subject that is becoming increasingly 
important. Chapter 23 (Simulation) also received a major revision, with considerable 
updating and an increased emphasis on the major building blocks of a simulation 
model. 

Finally, the other thirteen chapters received a careful review, with updating as
needed, considerable polishing, and substantial revision of occasional sections. A 
considerable number of new problems also have been added throughout the book. 

The overall thrust of all the revision efforts has been to build upon the strengths 
of previous editions while thoroughly updating the material and adding the new software
to fully meet the needs of the 1990s. We think that the net effect has been to
make this edition even more of a “student’s book”: clear, interesting, and well-
organized with lots of helpful examples and illustrations, good motivation and per- 
spective, easy-to-find important material, enjoyable homework, and so on, and without 
too much notation, terminology, and dense mathematics. We believe and trust that 
the numerous instructors who have used previous editions will agree that this is the 
best edition yet. 

The prerequisites for a course using this book can be relatively modest. As with 
previous editions, the mathematics has been kept at a relatively elementary level. 
Most of Parts 2 and 3 (Linear Programming and Mathematical Programming, respec- 
tively) requires no mathematics beyond high school algebra. Calculus is used only in 
Chap. 14 (Nonlinear Programming) and in one example in Chap. 11 (Dynamic Pro- 
gramming). Matrix notation is used in Chap. 5 (The Theory of the Simplex Method), 
Chap. 6 (Duality Theory and Sensitivity Analysis), Sec. 9.4 (An Interior-Point Al- 
gorithm), and Chap. 14, but the only background needed for this is presented in 
Appendix 3. For Part 4 (Probabilistic Models), a previous introduction to probability 
theory is assumed, and calculus is used in a few places. In general terms, the math- 
ematical maturity that a student achieves through taking an elementary calculus course 
is useful throughout Part 4 and for the more advanced material in Parts 2 and 3. 

The content of the book is aimed largely at the upper division undergraduate 
level and at first-year (master’s level) graduate students. There are many ways to 
package the material into a course. The book has great flexibility. Part 1 is an intro- 
duction to the subject of operations research. Part 2 (on linear programming) or Parts 
2 and 3 (on mathematical programming) may essentially be covered independently of 
Part 4 (on probabilistic models), and vice versa. Furthermore, the chapters in Parts 2 
and 3 are almost independent, except that they all use basic material presented in 
Chap. 3 and perhaps Chap. 4. Chapter 6 and Sec. 9.4 also draw upon Chap. 5. 
Sections 9.2 and 9.3 use parts of Chap. 6. Section 10.6 assumes an acquaintance with 
the problem formulations in Secs. 7.1, 7.3, and 7.4, while prior exposure to Secs. 
7.2 and 9.1 is helpful (but not essential) in Sec. 10.7. Within Part 4, there is consid- 
erable flexibility of coverage, although some integration of the material is available. 

An elementary survey course covering mathematical programming and some 
probabilistic models can be presented in a quarter (40 hours) or semester by selectively 
drawing from material in all four parts of the book. For example, a good survey of 
the field can be obtained from Chaps. 1, 2, 3, 4, 8, 10, 11, 16, 18, 19, 22, and 23. 
A more extensive elementary survey course can be completed in two quarters (60 to 
80 hours) by excluding just a few chapters, for example, Chaps. 9, 12, 20, and 21. 
Chapters 1 to 9 form an excellent basis for a (one-quarter) course in linear program- 
ming. The material in Chaps. 10 to 14 covers topics for another (one-quarter) course 
in other deterministic models. Finally, the material in Chaps. 15 to 23 covers the 
probabilistic (stochastic) models of operations research suitable for presentation in a 
(one-quarter) course. In fact, these latter three courses (the material in the entire text) 
can be viewed as a basic 1-year sequence in the techniques of operations research, 
forming the core of a master’s degree program. Each course outlined is currently being 
presented at either the undergraduate or graduate level at Stanford University, and 
this text has been used in the manner suggested. 

Again, as in previous editions, we thank our wives, Ann and Helen, for their 
editorial and word processing assistance, as well as their encouragement and under- 
standing when we devoted too many evenings and weekends to preparing this fifth 
edition. Our children, David, John, and Mark Hillier, Janet Lieberman Argyres, and
Joanne, Michael, and Diana Lieberman have literally grown up with the book and
our periodic hibernations to prepare a new edition. Now, most of them have used the
book as a text in their own college courses, given considerable advice, and even (in
the case of Mark Hillier) become a full-fledged collaborator. It is a joy to see them
and (we trust) the book reach maturity together.

The Nature of
Operations Research 


1.1 The Origins of Operations Research 


Since the advent of the industrial revolution, the world has seen a remarkable growth 
in the size and complexity of organizations. The artisans’ small shops of an earlier 
era have evolved into the billion-dollar corporations of today. An integral part of this 
revolutionary change has been a tremendous increase in the division of labor and 
segmentation of management responsibilities in these organizations. The results have 
been spectacular. However, along with its blessings, this increasing specialization has 
created new problems, problems that are still occurring in many organizations. One 
problem is a tendency for the many components of an organization to grow into 
relatively autonomous empires with their own goals and value systems, thereby losing 
sight of how their activities and objectives mesh with those of the overall organization. 
What is best for one component frequently is detrimental to another, so they may end 
up working at cross purposes. A related problem is that as the complexity and spe- 
cialization in an organization increase, it becomes more and more difficult to allocate 
its available resources to its various activities in a way that is most effective for the 
organization as a whole. These kinds of problems and the need to find a better way 
to resolve them provided the environment for the emergence of operations research. 

Introduction


The roots of operations research can be traced back many decades, when early 
attempts were made to use a scientific approach in the management of organizations. 
However, the beginning of the activity called operations research has generally been 
attributed to the military services early in World War II. Because of the war effort, 
there was an urgent need to allocate scarce resources to the various military operations 
and to the activities within each operation in an effective manner. Therefore the British 
and then the American military management called upon a large number of scientists 
to apply a scientific approach to dealing with this and other strategic and tactical 
problems. In effect they were asked to do research on (military) operations. These
teams of scientists were the first operations research teams. Their efforts allegedly 
were instrumental in winning the Air Battle of Britain, the Island Campaign in the 
Pacific, the Battle of the North Atlantic, and others. 

Spurred on by the apparent success of operations research in the military, industry
gradually became interested in this new field. As the industrial boom following
the war was running its course, the problems caused by the increasing complexity and 
specialization in organizations were again coming to the forefront. It was becoming 
apparent to a growing number of people, including business consultants who had 
served on or with the operations research teams during the war, that these were 
basically the same problems that had been faced by the military but in a different 
context. In this way operations research began to creep into industry, business, and 
civil government. By 1951, it had already taken hold in Great Britain and was in the 
process of doing so in the United States. Since then the field has developed very 
rapidly, as will be described further in Sec. 1.3. 

At least two other factors that played a key role in the rapid growth of operations 
research during this period can be identified. One was the substantial progress that 
was made early in improving the techniques available to operations research. After 
the war, many of the scientists who had participated on operations research teams or 
who had heard about this work were motivated to pursue research relevant to the field; 
important advancements in the state of the art resulted. A prime example is the simplex 
method for solving linear programming problems, developed by George Dantzig in 
1947. Many of the standard tools of operations research, e.g., linear programming, 
dynamic programming, queueing theory, and inventory theory, were relatively well
developed before the end of the 1950s. In addition to this rapid advancement in the
theory of operations research, a second factor that gave great impetus to the growth
of the field was the onslaught of the computer revolution. A large amount of computation
is usually required to deal most effectively with the complex problems typically
considered by operations research. Doing this by hand would often be out of
the question. Therefore, the development of electronic digital computers, with their
ability to perform arithmetic calculations thousands or even millions of times faster
than a human being can, was a tremendous boon to operations research. Today, 
mainframe computers, minicomputers, or microcomputers are essential for solving 
real-world operations research problems. 


1.2 The Nature of Operations Research 


What is operations research? One way of trying to answer this question is to give a 
definition. For example, operations research may be described as a scientific approach 
to decision making that involves the operations of organizational systems. However,
this description, like earlier attempts at a definition, is so general that it is equally
applicable to many other fields as well. Therefore, perhaps the best way of grasping 
the unique nature of operations research is to examine its outstanding characteristics. 

As its name implies, operations research involves “research on operations.”
This says something about both the approach and the area of application of the field. 
Thus operations research is applied to problems that concern how to conduct and 
coordinate the operations or activities within an organization. The nature of the or- 
ganization is essentially immaterial, and, in fact, operations research has been applied 
extensively in business, industry, the military, civil government and agencies, hos- 
pitals, and so forth. Therefore, the breadth of application is unusually wide. The 
approach of operations research is that of the scientific method. In particular, the 
process begins by carefully observing and formulating the problem and then con- 
structing a scientific (typically mathematical) model that attempts to abstract the es- 
sence of the real problem. It is then hypothesized that this model is a sufficiently 
precise representation of the essential features of the situation, so that the conclusions 
(solutions) obtained from the model are also valid for the real problem. This hypothesis 
is then modified and verified by suitable experimentation. Thus in a certain sense 
operations research involves creative scientific research into the fundamental properties 
of operations. However, there is more to it than this. Specifically, operations research 
is also concerned with the practical management of the organization. Therefore, to be 
successful, it must also provide positive, understandable conclusions to the decision
maker(s) when they are needed. 

Still another characteristic of operations research is its broad viewpoint. As 
implied in the preceding section, operations research adopts an organizational point 
of view. Thus it attempts to resolve the conflicts of interest among the components 
of the organization in a way that is best for the organization as a whole. This does 
not imply that the study of each problem must give explicit consideration to all aspects 
of the organization; rather, the objectives being sought must be consistent with those 
of the overall organization. An additional characteristic that was mentioned in passing 
is that operations research attempts to find the best or optimal solution to the problem 
under consideration. Rather than being content with merely improving the status quo, 
the goal is to identify the best possible course of action. Although it must be interpreted 
carefully, this “search for optimality” is a very important theme in operations
research. 

All these characteristics lead quite naturally to still another one. It is evident 
that no single individual should be expected to be an expert on all the many aspects 
of operations research work or the problems typically considered; this would require 
a group of individuals having diverse backgrounds and skills. Therefore, when un- 
dertaking a full-fledged operations research study of a new problem, it is usually 
necessary to use a team approach. Such an operations research team typically needs 
to include individuals who collectively are highly trained in mathematics, statistics 
and probability theory, economics, business administration, electronic computing, 
engineering and the physical sciences, the behavioral sciences, and the special tech- 
niques of operations research. The team also needs to have the necessary experience 
and variety of skills to give appropriate consideration to the many ramifications of the 
problem throughout the organization and to execute effectively all the diverse phases 
of the operations research study. 

In summary, operations research is concerned with optimal decision making in, 
and modeling of, deterministic and probabilistic systems that originate from real life. 
These applications, which occur in government, business, engineering, economics,
and the natural and social sciences, are characterized largely by the need to allocate 
limited resources. In these situations, considerable insight can be obtained from sci- 
entific analysis such as that provided by operations research. The contribution from 
the operations research approach stems primarily from: 


1. Structuring the real-life situation into a mathematical model, abstracting the 
essential elements so that a solution relevant to the decision maker’s objec- 
tives can be sought. This involves looking at the problem in the context of 
the entire system. 

2. Exploring the structure of such solutions and developing systematic proce- 
dures for obtaining them. 

3. Developing a solution, including the mathematical theory, if necessary, that 
yields an optimal value of the system measure of desirability (or possibly 
comparing alternative courses of action by evaluating their measure of de- 
sirability). 


1.3 The Impact of Operations Research 


Operations research has had an increasingly great impact on the management of or- 
ganizations in recent years. Both the number and the variety of its applications con- 
tinue to grow rapidly, and no slowdown is in sight. In fact, with the exception of the 
advent of the electronic computer, the extent of this impact seems to be unrivaled by 
that of any other recent development. 

After their success with operations research during World War II, the British
and American military services continued to have active operations research groups,
often at different levels of command. As a result, there now exists a large number of
people called “military operations researchers” who are applying an operations research
approach to problems of national defense. For example, they engage in tactical
planning for requirements and use of weapon systems as well as consider the larger
problems of the allocation and integration of effort. Some of their techniques involve
quite sophisticated ideas in political science, mathematics, economics, probability
theory, and statistics.

Operations research is also being used widely in other types of organizations, 
including business and industry. Sometimes the term “management science” is used
as a designation for this activity. Almost all the dozen or so largest corporations in
the world, and a sizable proportion of the small industrial organizations, either have
well-established operations research groups or have integrated the activity into the
regular components of the organization. Many industries, including aircraft and missile,
automobile, communication, computer, electric power, electronics, food, metallurgy,
mining, paper, petroleum, and transportation, have made widespread use of
operations research. Financial institutions, governmental agencies, and hospitals are
rapidly increasing their use of operations research.

As an example of the impact of operations research in government, the Presi- 
dent’s Commission on Aviation Safety issued its report in April 1988 on the strengths, 
weaknesses, and problems of the airspace system along with 15 recommendations for 
change. One of these recommendations stated that “Operations research (applied
mathematics methods and models for solving complex operations problems) should 
be recognized as a standard approach for problem solving in the FAA.” 


To be more specific, consider some of the problems that have been solved by 
particular techniques of operations research. Linear programming has been used suc- 
cessfully in the solution of problems concerned with assignment of personnel, blending 
of materials, distribution and transportation, and investment portfolios. Dynamic pro- 
gramming has been applied successfully to such areas as planning advertising ex- 
penditures, distributing sales effort, and production scheduling. Queueing theory has 
had application in solving problems concerned with traffic congestion, servicing ma- 
chines subject to breakdown, determining the level of a service force, air traffic 
scheduling, design of dams, production scheduling, and hospital operation. Other 
techniques of operations research, such as inventory theory, game theory, and simu- 
lation, also have been successfully applied to a variety of contexts. 

The extent of operations research activities, and the profile of operations research 
practitioners, in U.S. corporations have been frequently surveyed. Among the surveys 
that emphasized the companies in Fortune’s list of the top 500 corporations are the 
following: 


1. In 1972, Turban¹ reported on a survey of operations research activities that
provides a snapshot of activities in 1969. Mail questionnaires were sent to
the directors of operations research/management science of 475 companies.
These companies were selected from Fortune’s list of the top 500, using the
300 largest industrial corporations, 50 industrial corporations drawn from
the companies ranking between 300 and 500, and the 25 largest companies
in each of the service categories: banks, utilities, merchandising, life insurance,
and transportation. There were 107 questionnaires returned.

2. In 1977, Ledbetter and Cox² published the results of a survey of Fortune’s
500 firms (1975 listing) concerning utilization of operations research techniques
in their firms. There were 176 respondents.

3. In 1979, Thomas and DaCosta³ reported on a survey of operations research
activities in 1977. Mail questionnaires were sent to 420 individual corporations,
including 260 firms from Fortune’s 1975 list of the top 500, the
largest 100 industrial firms in California, and the balance from California
financial institutions. There were 150 questionnaires returned.

4. Finally, in 1983, Forgionne⁴ published the results of a survey of corporate
usage of operations research/management science. A questionnaire was
mailed in 1982 to a random sample of 500 corporations drawn from the
1,500 largest U.S. corporations. There were 125 respondents.


Both Turban and Thomas and DaCosta indicated that nearly half of the companies
reporting had a special department that was engaged mainly in operations
research/management science (OR/MS) activities. While this ratio has remained virtually
constant, and since all of the respondents used OR/MS techniques, Thomas
and DaCosta concluded that “management science is becoming a part of the everyday
activities of the modern firm and therefore is no longer a specialized function to be
undertaken by a separate specialized department.” An interesting finding of the Turban
survey was that almost all of the specialized departments reported to the company
president, vice-president, or controller.

¹ Turban, E.: “A Sample Survey of Operations Research Activities at the Corporate Level,” Operations
Research, 20:708-721, 1972.

² Ledbetter, W. N., and J. F. Cox: “Are OR Techniques Being Used,” Industrial Engineering, pp. 19-21,
February 1977.

³ Thomas, G., and J. DaCosta: “A Sample Survey of Corporate Operations Research,” Interfaces, 9:102-111,
1979.

⁴ Forgionne, G. A.: “Corporate Management Science Activities: An Update,” Interfaces, 13:20-23, 1983.

Table 1.1 Ranking of the Techniques of Operations Research/Management Science*

                                              Ledbetter   Thomas and
                                  Turban      and Cox     DaCosta     Forgionne
                                  (1969)      (1975)      (1977)      (1982)
Bayesian decision analysis        -           -           9           -
Delphi                            -           -           13.5        -
Dynamic programming               6           6           10          7
Financial methods                 -           -           13.5        -
Game theory                       -           7           -           8
Heuristic programming             8.5         -           8           -
Integer and mixed programming     -           -           12          -
Inventory theory                  4           -           5           -
Linear programming                3           2           3           4
Network models                    -           4           -           -
Nonlinear programming             7           -           7           6
PERT/CPM                          5           -           4           3
Risk analysis                     -           -           11          -
Queueing theory                   8.5         5           6           5
Simulation                        2           3           2           2
Statistical analysis              1           1           1           1

* Rank 1 denotes the most frequently used technique.

All of the surveys attempted to find which techniques of OR/MS were most 
frequently used. Table 1.1 presents a ranking of the techniques of operations research
that are being applied in U.S. corporations based upon the aforementioned surveys. 
It is interesting to note that all four surveys, representing a span of 14 years, are in 
basic agreement. Statistical analysis, simulation, and linear programming were the 
most widely used techniques, although PERT/CPM, inventory theory, and queueing 
theory were not far behind. 

As noted earlier, OR/MS techniques are applied to a broad spectrum of cor- 
porate problem areas, and two of the surveys dealt with this issue. Table 1.2 presents 
a ranking of the application areas given in the Thomas and DaCosta and Forgionne 
surveys. The classification of application areas differs somewhat in these two surveys, 
but it is evident that capital budgeting, forecasting, inventory control, production 
planning, and project planning are the most popular application areas.

Table 1.2 Ranking of the Application Areas of Operations
Research/Management Science*

                                      Thomas and
                                      DaCosta       Forgionne
                                      (1977)        (1982)
Accounting                            11            5
Advertising and sales research        8             -
Capital budgeting                     4             2
Equipment replacement                 9             -
Forecasting - market planning         1             6
Inventory control                     2.5           4
Maintenance                           10            9
Packaging                             12            -
Personnel management                  -             10
Plant location                        6             8
Production planning and scheduling    2.5           3
Project planning                      -             1
Quality control                       7             7
Transportation                        5             -

* Rank 1 denotes the most frequent application area.

In 1976, Fabozzi and Valente⁵ reported on the results of a questionnaire mailed
to 1,000 firms in the United States in November 1974 concerning the use of mathematical
programming (linear, nonlinear, and dynamic). There were 184 responses
received. The authors found that the most important area of application of mathematical
programming was production management (determination of product mix,
allocation of resources, plant and machine scheduling, and work scheduling). The
next largest area of application was financial and investment planning (capital budgeting,
cash flow analysis, portfolio management for the employee pension fund, cash
management, and merger and acquisitions analysis). The quality of results reported
by these firms is given in Table 1.3.

⁵ Fabozzi, F. J., and J. Valente: “Mathematical Programming in American Companies: A Sample Survey,”
Interfaces, 7(1):93-98, November 1976.

Because of the great impact of operations research, professional societies de- 
voted to this field and related activities have been founded in a number of countries 
throughout the world. In the United States, the Operations Research Society of Amer- 
ica (ORSA), established in 1952, and The Institute of Management Sciences (TIMS), 
founded in 1953, each has over 6,000 members. ORSA publishes the journal Oper- 
ations Research and TIMS Management Science. The two societies also jointly publish 
Mathematics of Operations Research and Interfaces. These four journals contain well 
over 3,000 pages per year reporting new research and applications in the field. In 
addition, there are many other similar journals published in such countries as the 
United States, England, France, India, Japan, Canada, and West Germany. Indeed, 
there are 32 member countries (including the United States) in the International Fed- 
eration of Operational Research Societies (IFORS), with each country having a na- 
tional operations research society. 


Table 1.3 Quality of Results Reported by Firms Employing Mathematical Programming
(Fabozzi and Valente Survey)

                  Linear              Nonlinear             Dynamic
                Programming          Programming          Programming
Results        No.   Percent        No.   Percent        No.   Percent
Good           102      76           38      57           27      53
Fair            21      16           19      28           15      29
Poor             6       3            6       9            3       6
Uncertain        7       5            4       6            6      12
Total          133     100           67     100           51     100

Operations research also has had considerable impact in colleges and universities.
Today most of the major American universities offer courses in this field, and
many offer advanced degrees that are either in or with specialization in operations 
research. As a result, there are now thousands of students taking at least one course 
in operations research each year. Much of the basic research in the field is also being 
done in the universities. 


1.4 Training for a Career in Operations Research 





Because of the great growth of operations research, career opportunities in this field 
appear to be outstanding. The demand for trained people continues to far exceed the 
supply, and both attractive starting positions and rapid advancement are readily avail- 
able. Because of the nature of their work, operations research groups tend to have a 
prominent staff position, with access to higher-level management in the organization. 
The problems they work on tend to be important, challenging, and interesting. There- 
fore, any individual with a mathematics and science orientation who is also interested 
in the practical management of organizations is likely to find a career in operations 
research very rewarding. 

Three complementary types of academic training are particularly relevant for a 
career in operations research. The first is basic training in the fundamentals upon 
which operations research is based. This includes the basic methodology of mathe- 
matics and science as well as such topics as linear algebra and matrix theory, prob- 
ability theory, statistical inference, stochastic processes, computer science, micro- 
economics, accounting and business administration, organization theory, and the 
behavioral sciences. 

A second important type of training is in operations research per se, including 
special techniques of the field such as linear and nonlinear programming, dynamic 
programming, inventory theory, network flow theory, queueing models, reliability, 
game theory, and simulation. It should also include an introduction to the methodology 
of operations research, where the various techniques and their role in an operations 
research study involving specific problem areas would be placed in perspective. Often 
courses covering certain of these topics are offered in more than one department within 
a university, including departments of business, industrial engineering, mathematics, 
statistics, computer science, economics, and electrical engineering. This is a natural 
reflection of the broad scope of application of the field. Since it does spread across 
traditional disciplinary lines, separate programs or departments in operations research 
also are being established in some universities. 

Finally, it is also good to have specialized training in some field other than
operations research, for example, mathematics, statistics, industrial engineering, busi- 
ness, or economics. This additional training provides one with an area of special 
competence for applying operations research, and it should make that person a more 
valuable member of an operations research team. 

The early operations researchers were people whose primary training and work 
had been in some traditional field, such as physics, chemistry, mathematics, engi- 
neering, or economics. They tended to have little or no formal education in operations 
research per se. However, as the body of special knowledge has expanded, it has 
become increasingly difficult to enter the field without considerable prior
education in this area. As a result, although it is still common for new operations
researchers to have their college degree(s) in a traditional field, they generally have
also specialized in operations research as part of their academic program. The
traditional fields that have most commonly served as a vehicle into operations research are
indicated in Table 1.4, which is based on the 1972 survey by Turban described in the
preceding section. The Thomas and DaCosta survey confirms the diversity of the
educational backgrounds of OR/MS practitioners. They noted that the percentage of
Ph.D.’s has decreased from 20 percent to 13 percent during the time interval of the
two surveys, which they speculate may be due to the ‘‘maturity’’ of the OR/MS
techniques in industry and to the lack of need for separate specialized OR/MS
departments.


Table 1.4 Educational Background of Operations Research Personnel (Turban Survey)

                                    Percentage of Total at Degree Level
                                                                          All Degree
Major Field of Study                Bachelor’s   Master’s   Doctorate       Levels
Operations research and
  management science                     3          24         32             12
Mathematics and statistics              26          16         21             22
Business administration                 20          27          2             22
Engineering                             34          17         29             28
Other                                   17          16         16             16

Percentage of total                     27          53         20

Finally, in 1982¹ a survey of TIMS membership provided a profile of its membership,
including information on educational training, job activity, and compensation
for professionals in industry, government, universities, and consulting. This report
reinforced the point that, as the profession has matured, fewer people with formal
training in fields other than operations research are entering the profession than
in previous decades.


1.5 The Road Ahead 


As an introduction to operations research, this book is designed to acquaint students 
with the formulation, solution, and implementation of operations research models for 
analyzing complex systems problems in industry or government. Part 1 introduces the 
reader to the field of operations research. It provides an overview of the operations 
research modeling approach and describes the major phases of a typical operations 
research study. Part 2 presents the topic of linear programming, a prominent area of 
operations research concerned largely with how to allocate limited resources among 
the various activities of an organization. Part 3 deals with the broad topic of mathe- 
matical programming, including integer and nonlinear programming. Part 4 considers 
a number of probabilistic models that take into account the uncertainty associated with 
future events in order to analyze certain important problems. 


1 Hall, J. R., Jr.: ‘‘Career Paths and Compensation in Management Science: Results of a TIMS Membership
Survey,’’ Interfaces, 14(3):15-23, May-June 1984.
Much of the material presented in Parts 2, 3, and 4 can be described in terms
of typical examples of situations that are encountered in practice. Synopses of several 
such examples are presented here, with detailed solutions given in successive chapters. 

The technique of linear programming is illustrated by a company that operates 
a reclamation center that collects several types of solid waste materials and then treats 
them so they can be amalgamated into a salable product. Different grades of this 
product can be made, depending upon the mix of the materials used. Although there 
is some flexibility in the mix for each grade, quality standards do specify a minimum 
or maximum percentage (by weight) of certain materials allowed in that product grade. 
Data are available on the cost of amalgamation and the selling price for each grade. 
The reclamation center collects its solid waste materials from some regular sources 
and so is normally able to maintain a steady production rate for treating these materials. 
Furthermore, the quantities available for collection and treatment each week, as well 
as the cost of treatment, for each type of material are known. Using the given infor- 
mation, the company is to determine just how much of each product grade to produce 
and the exact mix of materials to be used for each grade so as to maximize their total
weekly profit (total sales income minus the total costs of both amalgamation and 
treatment). 

Another example of linear programming concerns a steel producer who is facing 
an air pollution problem caused by pollutants emanating from the manufacturing plant. 
The three main types of pollutants in the airshed are particulate matter, sulfur oxides, 
and hydrocarbons. New standards require that the company reduce its annual emission 
of these pollutants. The steelworks has two primary sources of pollution, namely, the 
blast furnaces for making pig iron and the open-hearth furnaces for changing iron into 
steel. In both cases the engineers have decided that the most effective types of
abatement methods are (1) increasing the height of the smokestacks, (2) using filter devices
(including gas traps) in the smokestacks, and (3) including cleaner high-grade
materials among the fuels for the furnaces. All these methods have known technological
limits on how much emission they can eliminate. Fortunately, the methods can be 
used at any fraction of their abatement capacities. A cost analysis results in estimates 
of the total annual cost that is incurred by each abatement method when used by blast 
and open-hearth furnaces (cost of less-than-full-capacity use of a method is essentially 
proportional to its fractional capacity). With use of the aforementioned data, the 
optimal plan (minimum cost) for pollution abatement is to be determined. This plan 
would consist of specifying which types of abatement method would be used and at
what fractions of their abatement capacities for (1) blast furnaces and (2) open-hearth 
furnaces. 

One of the important special types of linear programming problems is called the 
transportation problem; a typical example deals with a company producing canned
peas. The peas are prepared at several distantly located canneries and then shipped 
by truck to distributing warehouses throughout the western United States. Because 
the shipping costs are a major expense, management is initiating a study to reduce 
them as much as possible. For the upcoming season, an estimate has been made of 
what the output will be from each cannery, and each warehouse has been allocated a 
certain amount from the total supply of peas. This information (in units of truckloads), 
along with the shipping cost per truckload for each cannery-warehouse combination,
is given. Using the data, the optimal plan for assigning these shipments to the
various cannery-warehouse combinations that minimizes total shipping costs is to
be determined.
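
To make the structure of such a problem concrete, the following minimal Python sketch searches every whole-truckload shipping plan for a deliberately tiny instance. The two canneries, three warehouses, and all supply, demand, and cost figures are hypothetical illustrations, not data from the text. Exhaustive search is used only because the instance is so small; it is not the specialized solution procedure for transportation problems.

```python
from itertools import product

# Hypothetical data (truckloads and cost per truckload), for illustration only.
SUPPLY = [30, 40]                  # output of canneries 0 and 1
DEMAND = [20, 25, 25]              # allocation of warehouses 0, 1, 2
COST = [[8, 6, 10],                # COST[i][j]: cannery i -> warehouse j
        [9, 12, 7]]


def cheapest_plan(supply, demand, cost):
    """Search all integer shipping plans for a two-cannery instance."""
    if len(supply) != 2 or sum(supply) != sum(demand):
        raise ValueError("expects two canneries and balanced supply/demand")
    best_cost, best_plan = None, None
    # Pick the shipments from cannery 0; cannery 1 must cover what is left.
    for first_row in product(*(range(d + 1) for d in demand)):
        if sum(first_row) != supply[0]:
            continue
        second_row = [d - x for d, x in zip(demand, first_row)]
        plan = [list(first_row), second_row]
        total = sum(cost[i][j] * plan[i][j]
                    for i in range(2) for j in range(len(demand)))
        if best_cost is None or total < best_cost:
            best_cost, best_plan = total, plan
    return best_cost, best_plan


total_cost, shipments = cheapest_plan(SUPPLY, DEMAND, COST)
print(total_cost, shipments)
```

For these assumed figures the cheapest plan costs 500, with cannery 0 sending 5 and 25 truckloads to warehouses 0 and 1 and cannery 1 supplying the rest.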


In addition to linear programming, there are a number of related mathematical 
programming techniques for dealing with similar kinds of problems. One of these is 
dynamic programming, which is concerned with making a sequence of interrelated 
decisions. It is illustrated by a job shop whose workload is subject to considerable 
seasonal fluctuation. However, machine operators are difficult to hire and costly to 
train, so the manager is reluctant to lay off workers during the slack seasons. The 
manager is likewise reluctant to maintain a peak payroll when it is not required. 
Furthermore, the manager is definitely opposed to overtime work on a regular basis. 
Because all work is done to custom orders, it is not possible to build up inventories 
during slack seasons. Therefore, the manager is in a dilemma as to what the policy 
should be regarding employment levels. Estimates are available for the personnel 
requirements during the four seasons of the year for the foreseeable future. Employ- 
ment is not permitted to fall below these levels. Any employment above these levels 
is wasted. The salaries, hiring costs, and firing costs are known. Assuming that
fractional levels of employment are possible because of a few part-time employees, 
the employment in each season that minimizes the total cost is to be determined. 
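
The idea of a sequence of interrelated decisions can be sketched in a few lines of Python. Everything numerical below is an assumption made for illustration: the seasonal requirements, a cost of 2,000 per surplus employee per season, a cost of 200 times the square of any change in employment level, and a grid of half-employee steps standing in for fractional employment. The seasons are ordered so that the peak season comes last, and employment is assumed to start from the peak level.

```python
# Hypothetical data, for illustration only.
REQUIRED = [220, 240, 200, 255]   # minimum employment, peak season last
EXCESS_COST = 2000                # assumed cost per surplus employee per season
CHANGE_COST = 200                 # assumed factor: CHANGE_COST * (change)**2


def stage_cost(previous, level, required):
    """Cost of one season: changing the level plus paying surplus staff."""
    return CHANGE_COST * (level - previous) ** 2 + EXCESS_COST * (level - required)


def plan_employment(required, start_level):
    """Backward recursion over seasons on a grid of half-employee levels."""
    peak = max(required)
    grid = [h / 2 for h in range(2 * min(required), 2 * peak + 1)]
    value = {s: 0.0 for s in grid}    # cost-to-go after the final season
    choices = []
    for need in reversed(required):
        new_value, choice = {}, {}
        for previous in grid:
            options = [(stage_cost(previous, x, need) + value[x], x)
                       for x in grid if x >= need]
            new_value[previous], choice[previous] = min(options)
        value = new_value
        choices.insert(0, choice)
    # Forward pass: read off the best level season by season.
    plan, level = [], start_level
    for choice in choices:
        level = choice[level]
        plan.append(level)
    return value[start_level], plan


total_cost, levels = plan_employment(REQUIRED, 255.0)
print(total_cost, levels)
```

With these assumed costs the recursion keeps employment well above the slack-season requirements (247.5, 245, 247.5, then 255) at a total cost of 185,000, which shows how the cost of changing levels is traded off against the cost of surplus staff.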

Among the probabilistic models considered in Part 4 are some falling into the 
area of queueing (waiting line) theory. A queueing theory model is illustrated by a 
hospital emergency room. The emergency room provides quick medical care for emer- 
gency cases that are brought to the hospital by ambulance or private automobile. At 
any hour there is always one doctor on duty in the emergency room. However, because 
of a growing tendency for emergency cases to use these facilities rather than go to a 
private physician, the hospital has been experiencing a continuing increase in the 
number of emergency room visits each year. As a result, when patients arrive during 
peak usage hours (the early evening), they have to wait until it is their turn to be 
treated by the doctor. Therefore, a proposal has been made that a second doctor should 
be assigned to the emergency room during these hours so that two emergency cases 
can be treated simultaneously. By recognizing that the emergency room is a queueing 
system, several alternative queueing theory models can be applied to predict the 
waiting characteristics of the system with one doctor and with two doctors. These 
models will aid the hospital in its evaluation of the proposal to add a second physician. 
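
One of the simplest such models can be sketched as follows. The sketch assumes Poisson arrivals and exponentially distributed treatment times with identical doctors (the standard multiple-server model of this kind), and the rates used, two patients arriving per hour and three patients treated per hour per doctor, are hypothetical. Other queueing models would give different figures.

```python
from math import factorial


def mms_metrics(arrival_rate, service_rate, servers):
    """Steady-state measures for a queue with Poisson arrivals,
    exponential service times, and `servers` identical servers."""
    offered_load = arrival_rate / service_rate
    utilization = offered_load / servers
    if utilization >= 1:
        raise ValueError("arrivals outpace service; no steady state exists")
    # Probability that the system is empty.
    p0 = 1 / (sum(offered_load ** n / factorial(n) for n in range(servers))
              + offered_load ** servers / (factorial(servers) * (1 - utilization)))
    # Expected number of patients waiting (excluding those being treated).
    lq = (p0 * offered_load ** servers * utilization
          / (factorial(servers) * (1 - utilization) ** 2))
    # Expected wait before treatment begins, by Little's formula.
    wq = lq / arrival_rate
    return {"utilization": utilization, "Lq": lq, "Wq": wq}


# Hypothetical peak-hour rates (per hour), for illustration only.
one_doctor = mms_metrics(arrival_rate=2, service_rate=3, servers=1)
two_doctors = mms_metrics(arrival_rate=2, service_rate=3, servers=2)
print(one_doctor["Wq"] * 60, two_doctors["Wq"] * 60)   # minutes
```

Under these assumptions the expected wait before treatment falls from 40 minutes with one doctor to 2.5 minutes with two, the kind of comparison the hospital would weigh against the cost of the second physician.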

A similar queueing example in a very different context concerns determination 
of the optimal number of repairers for a group of machines. A company uses 10 
identical machines in its production facility. However, because these machines break 
down and require repair frequently, the company has only enough operators to operate 
eight machines at a time, so two machines are available on a standby basis for use 
while other machines are down. Thus eight machines are always operating whenever 
no more than two machines are waiting to be repaired, but the number of operating 
machines is reduced by one for each additional machine waiting to be repaired. The 
probability distribution of the time until any given operating machine breaks down 
and the probability distribution of the time required to repair a machine are known 
from past history. Up until now the company has had just one repairer to repair these 
machines. However, this has frequently resulted in reduced productivity by having 
fewer than eight operating machines. Therefore, consideration is being given to hiring 
a second repairer so that two machines can be repaired simultaneously. Thus the 
queueing system to be studied has the repairers as its servers and the machines re- 
quiring repair as its customers, where the problem is to choose between having one 
or two servers (or possibly more). Given the cost of each repairer and the cost of 
inoperable machines, the optimal number of repairers is to be determined. 




Inventory theory is illustrated by a television manufacturing company that pro- 
duces its own speakers, which are used in the production of its television sets. The 
television sets are assembled on a continuous production line at a known monthly 
rate. The speakers are produced in batches because they do not warrant setting up a
continuous production line and because relatively large quantities can be produced in
a short time. The company is interested in determining when and how many to
produce. Several costs must be considered. (1) Each time a batch is produced, a setup
cost is incurred. This cost includes the cost of ‘‘tooling up,’’ administrative costs,
record keeping, and so on. (2) The production of speakers in large batch sizes leads
to a large inventory, resulting in a monthly cost for keeping a speaker in stock. This
cost includes the cost of capital tied up, storage space, insurance, taxes, protection,
and so forth. (3) A cost of producing a single speaker (excluding the setup cost) is
incurred. (4) Company policy prohibits deliberately planning for shortages of any of
its components. However, a shortage of speakers occasionally occurs, resulting in a
monthly cost for each speaker unavailable when required. This cost includes the cost 
of installing speakers after the television set is fully assembled, storage space, delayed 
revenue, record keeping, and so on. Given data on these costs, the optimal batch size 
(and period between production) is to be determined. 
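
A minimal sketch of the batch-size calculation follows, using the classical economic order quantity formulas for a constant demand rate, with and without backlogged shortages. The cost figures and the monthly demand are hypothetical, and the formulas assume that demand is known and constant and that a batch arrives all at once; these assumptions are part of the sketch, not of the description above.

```python
from math import sqrt


def eoq_no_shortages(demand_rate, setup_cost, holding_cost):
    """Batch size balancing setup cost against holding cost."""
    return sqrt(2 * demand_rate * setup_cost / holding_cost)


def eoq_with_shortages(demand_rate, setup_cost, holding_cost, shortage_cost):
    """Return (batch size, peak inventory) when shortages are backlogged."""
    base = eoq_no_shortages(demand_rate, setup_cost, holding_cost)
    ratio = shortage_cost / (shortage_cost + holding_cost)
    return base / sqrt(ratio), base * sqrt(ratio)


def monthly_cost(batch, peak, demand_rate, setup_cost, unit_cost,
                 holding_cost, shortage_cost):
    """Average monthly cost of producing `batch` units per run, with the
    inventory level reaching `peak` just after each run."""
    setup = demand_rate * setup_cost / batch
    holding = holding_cost * peak ** 2 / (2 * batch)
    shortage = shortage_cost * (batch - peak) ** 2 / (2 * batch)
    return setup + demand_rate * unit_cost + holding + shortage


# Hypothetical data, for illustration only.
DEMAND = 8000      # speakers needed per month
SETUP = 12000      # cost per production run
UNIT = 10          # production cost per speaker
HOLDING = 0.30     # cost per speaker held per month
SHORTAGE = 1.10    # cost per speaker short per month

batch_basic = eoq_no_shortages(DEMAND, SETUP, HOLDING)
batch, peak = eoq_with_shortages(DEMAND, SETUP, HOLDING, SHORTAGE)
print(round(batch_basic), round(batch), round(peak), batch / DEMAND)
```

The last printed value is the time between production runs in months, obtained by dividing the batch size by the monthly demand.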

The use of Markovian decision processes can be described in terms of a pro- 
duction process that contains a machine that deteriorates rapidly in both quality and 
output under heavy usage, so that it is inspected at the end of each day. Immediately
after inspection, the condition of the machine is noted and classified into one of four 
possible states: 0 (as good as new), 1 (operable—minor deterioration), 2 (operable— 
major deterioration), and 3 (inoperable—output of unacceptable quality). The state of 
the system is assumed to evolve according to some known probabilistic ‘‘laws of
motion.’’ At the end of each day, one of three decisions can be made: (1) leave the
machine alone; (2) overhaul the machine, which results in leaving it operable with
minor deterioration; and (3) replace it, which results in a new machine. As a result
of the state of the system found at the end of the day and the decision taken, a cost 
is incurred. Given these costs and a description of the probabilistic ‘‘laws of motion,’’ 
an optimal maintenance policy is to be found.
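
The following Python sketch shows one way such a policy could be computed, using value iteration with a discount factor. All numbers are assumptions introduced for illustration: the transition probabilities, the costs, the discount factor of 0.9, and the restriction on which decisions are allowed in each state (no overhaul unless deterioration is major, and no option but replacement when the machine is inoperable).

```python
# Hypothetical cost of each allowed (state, decision) pair.
COSTS = {
    0: {"leave": 0},
    1: {"leave": 1000, "replace": 6000},
    2: {"leave": 3000, "overhaul": 4000, "replace": 6000},
    3: {"replace": 6000},
}
# Hypothetical next-day state probabilities when the machine is left alone.
LEAVE_ALONE = {
    0: [0, 7 / 8, 1 / 16, 1 / 16],
    1: [0, 3 / 4, 1 / 8, 1 / 8],
    2: [0, 0, 1 / 2, 1 / 2],
}
DISCOUNT = 0.9   # assumed weight placed on next day's costs


def next_state_probs(state, decision):
    """Distribution of tomorrow's state given today's state and decision."""
    if decision == "leave":
        return LEAVE_ALONE[state]
    if decision == "overhaul":
        return [0, 1, 0, 0]      # operable with minor deterioration
    return [1, 0, 0, 0]          # replacement gives a new machine


def value_iteration(tolerance=1e-9):
    """Repeat the one-day look-ahead until the values stop changing."""
    values = [0.0] * len(COSTS)
    while True:
        new_values, policy = [], {}
        for state in sorted(COSTS):
            scored = {
                decision: cost + DISCOUNT * sum(
                    p * v for p, v in zip(next_state_probs(state, decision), values))
                for decision, cost in COSTS[state].items()
            }
            best = min(scored, key=scored.get)
            policy[state] = best
            new_values.append(scored[best])
        if max(abs(a - b) for a, b in zip(new_values, values)) < tolerance:
            return new_values, policy
        values = new_values


values, policy = value_iteration()
print(policy)
```

For these assumed figures the resulting policy is to leave the machine alone in states 0 and 1, overhaul it in state 2, and replace it in state 3.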


1.6 Algorithms and OR COURSEWARE 


An important part of this book is the presentation of the major algorithms (iterative 
solution procedures) of operations research for solving the types of problems described 
in the previous section. Some of these algorithms are amazingly efficient and are
routinely used on problems involving hundreds or thousands of variables. Outside the 
classroom, they normally are executed on computers because of the relatively exten- 
sive numerical calculations involved. 

To aid the student in learning these algorithms, personal software (entitled OR
COURSEWARE) is packaged in the back of the book. Separate diskettes are available
for either an IBM (or IBM-compatible) personal computer with a graphics card or a
Macintosh. (For an IBM personal computer that takes a 3½-inch diskette, the
Instructor’s Guide, which is available to the teacher, will contain information on how to
obtain equivalent diskettes of this size that can be copied.) Although use of this 
software will result in an enhancement of the textbook, it is not essential for the 
student to have access to a microcomputer to comprehend the material presented in 
the book. 


Three types of routines are included in the software. One is demonstration 
examples that display and explain the algorithms in action. These ‘‘demos’’ supple- 
ment the examples in the book. 

For doing homework problems, the second type of routine—interactive exe- 
cution of algorithms—commonly will be used. The computer does all the routine 
calculations while the student focuses on learning and executing the logic of the 
algorithm. 

A third type occasionally available is routines for automatic execution of algo- 
rithms. This type will be used to test the student’s ability to formulate models and to 
perform subsequent analysis much as practitioners do with the output of production 
codes. 

The OR COURSEWARE will be described at the end of Sec. 4.3, which is the 
first time it normally would be used. However, some students may wish to begin 
getting acquainted with it now, including reading the introduction that comes on the 
screen when a diskette is inserted into the disk drive. This introduction, and subsequent 
instructions, will guide the student through the complete use of the software. 
The bulk of this book is devoted to the mathematical methods of operations research.
This is quite appropriate because these quantitative techniques form the main part of 
what is known about operations research. However, it does not imply that practical 
operations research studies are primarily mathematical exercises. As a matter of fact, 
the mathematical analysis often represents only a relatively small part of the total 
effort required. The purpose of this chapter is to place things into better perspective 
by describing all the major phases of a typical operations research study. 

One way of summarizing the usual phases of an operations research study is the 
following¹:


1. Formulating the problem.
2. Constructing a mathematical model to represent the system under study.
3. Deriving a solution from the model.
4. Testing the model and the solution derived from it.
5. Establishing controls over the solution.
6. Putting the solution to work: implementation.


1 Ackoff, Russell L.: ‘‘The Development of Operations Research as a Science,’’ Operations Research,
4:265f, 1956.


Each of these phases will be discussed in turn in the following sections. 


2.1 Formulating the Problem 
In contrast to textbook examples, most practical problems are initially communicated
to an operations research team in a vague, imprecise way. Therefore, the first order 
of business is to study the relevant system and develop a well-defined statement of 
the problem to be considered. This includes determining such things as the appropriate 
objectives, the constraints on what can be done, interrelationships between the area 
to be studied and other areas of the organization, the possible alternative courses of 
action, time limits for making a decision, and so on. This process of problem for- 
mulation is a crucial one because it greatly affects how relevant the conclusions of 
the study will be. It is difficult to extract a ‘‘right’’ answer from the ‘‘wrong’’ problem!
Consequently, this phase should be executed with considerable care, and the initial 
formulation should be continually reexamined in the light of new insights obtained 
during the later phases. 

The first thing to recognize is that an operations research team is normally 
working in an advisory capacity. The team members are not just given a problem and 
told to solve it however they see fit. Instead, they are advising management (often 
one key decision maker). The team performs a detailed technical analysis of the 
problem and then presents its recommendations to management. Frequently, the report 
to management will identify a number of alternatives that are particularly attractive 
under different assumptions or over a different range of values of some policy param- 
eter that can be evaluated only by management (e.g., the trade-off between cost and 
benefits). Management evaluates the study and its recommendations, takes into ac- 
count a variety of intangible factors, and makes the final decision based on its best 
judgment. Consequently, it is vital for the operations research team to get on the same 
wavelength as management, including identifying the ‘‘right’’ problem from manage- 
ment’s viewpoint, and to build the support of management for the course that the 
study is taking. 

Determining the appropriate objectives is a very important aspect of problem 
formulation. To do this, it is necessary first to identify the member (or members) of 
management who actually will be making the decisions concerning the system under 
study and then to probe into this individual’s thinking regarding the pertinent objec- 
tives. (Involving the decision maker from the outset also helps to build his or her 
support for the implementation of the study.) After the decision maker’s objectives 
have been elicited, they should be analyzed and edited for the identification of the 
ultimate objectives that encompass the other objectives, the determination of the rela- 
tive importance of these ultimate objectives, and the statement of them precisely in a 
way that does not eliminate worthwhile goals and alternatives. 

By its nature, operations research is concerned with the welfare of the entire 
organization rather than that of only certain of its components. An operations research 
study seeks solutions that are optimal for the overall organization rather than
suboptimal solutions that are best for only one component. Therefore, the objectives that
are formulated should ideally be those of the entire organization. However, this is not 
always convenient to do. Many problems primarily concern only a portion of the
organization, so the analysis would become unwieldy if the stated objectives were too 
general and if explicit consideration were given to all side effects on the rest of the 
organization. Granted that operations research takes the viewpoint of the overall or- 
ganization, this does not imply that each problem should be broadened into a study 
of the entire organization. Instead, the objectives used in the study should be as specific 
as they can be while still encompassing the main goals of the decision maker and 
maintaining a reasonable degree of consistency with the higher-level objectives of the 
organization. Side effects on other segments of the organization must then be consid- 
ered only to the extent that there are questions of consistency with these higher-level 
objectives. 

For profit-making organizations, one possible approach to circumventing the 
problem of suboptimization is to use long-run profit maximization as the sole objective. 
The adjective long-run indicates that this objective provides the flexibility to consider 
activities that do not translate into profits immediately (e.g., research and development
projects) but need to do so eventually in order to be worthwhile. At first glance, this
approach appears to have considerable merit. In particular, this objective is specific
enough to be used conveniently, and yet it seems to be broad enough to encompass 
the basic goal of profit-making organizations. In fact, some people believe that all 
other legitimate objectives can be translated into this one. 

However, this is an oversimplification, and considerable caution is required! A 
number of studies of American corporations have found that the goal of satisfactory 
profits, combined with other objectives, is preferred over profit maximization. (In 
fact, inadequate consideration of long-run profits sometimes is cited as a major reason 
why American industry may be losing its competitive edge over that of other leading 
countries.) In particular, typical objectives might be to maintain stable profits, increase 
(or maintain) one’s share of the market, provide for product diversification, maintain 
stable prices, improve worker morale, maintain family control of the business, and 
increase company prestige. Fulfilling these objectives might result in the achievement 
of long-run profit maximization, but the relationship is sufficiently obscure that it may 
not be convenient to incorporate them into this one objective. 

Furthermore, there are additional considerations involving social responsibilities 
that are distinct from the profit motive. The five parties affected by a business firm 
located in a single country are: (1) the owners (stockholders), who desire profits 
(dividends, stock appreciation, etc.); (2) the employees, who desire steady employment
at reasonable wages; (3) the customers, who desire a reliable product at a
reasonable price; (4) the vendors, who desire integrity and a reasonable selling price 
for their goods; and (5) the government and, hence, the nation, which desires payment 
of fair taxes and consideration of the national interest. All five parties make essential 
contributions to the firm, and the firm should not be viewed as the exclusive servant 
of any one party for the exploitation of others. By the same token, international 
corporations acquire additional obligations to follow socially responsible practices.
Therefore, although granting that management’s prime responsibility is to make profits 
(which ultimately benefits all five parties), its broader social responsibilities also must 
be recognized. 


2.2 Constructing a Mathematical Model 





After formulating the decision maker’s problem, the next phase is to reformulate this 
problem into a form that is convenient for analysis. The conventional operations 
research approach for doing this is to construct a mathematical model that represents 
the essence of the problem. Before discussing how to formulate such a model, let us 
first explore the nature of models in general and of mathematical models in particular. 

Models, or idealized representations, are an integral part of everyday life. Com- 
mon examples of models include model airplanes, portraits, globes, and so on. Sim- 
ilarly, models play an important role in science and business, as illustrated by models 
of the atom, models of genetic structure, mathematical equations describing physical 
laws of motion or chemical reactions, graphs, organization charts, and industrial 
accounting systems. Such models are invaluable for abstracting the essence of the 
subject of inquiry, showing interrelationships, and facilitating analysis. 

Mathematical models are also idealized representations, but they are expressed 
in terms of mathematical symbols and expressions. Such laws of physics as $F = ma$
and $E = mc^2$ are familiar examples. Similarly, the mathematical model of a business
problem is the system of equations and related mathematical expressions that describe
the essence of the problem. Thus, if there are $n$ related quantifiable decisions to be
made, they are represented as decision variables (say, $x_1, x_2, \ldots, x_n$) whose
respective values are to be determined. The appropriate measure of performance (e.g.,
profit) is then expressed as a mathematical function of these decision variables (e.g.,
$P = 3x_1 + 2x_2 + \cdots + 5x_n$). This function is called the objective function. Any
restrictions on the values that can be assigned to these decision variables are also
expressed mathematically, typically by means of inequalities or equations (e.g.,
$x_1 + 3x_1 x_2 + 2x_2 = 10$). Such mathematical expressions for the restrictions often are called
constraints. The constants (coefficients or right-hand sides) in the constraints and the 
objective function are called the parameters of the model. The mathematical model 
might then say that the problem is to choose the values of the decision variables so 
as to maximize the objective function, subject to the specified constraints. Such a 
model, and minor variations of it, typify the models used in operations research. 
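
The description above can be summarized in one generic statement. The symbols $f$, $g_i$, $b_i$, and $m$ are introduced here only for illustration: $f$ stands for the objective function, each $g_i$ for the left-hand side of one of $m$ constraints, and each $b_i$ for its right-hand side. Individual constraints may equally well be equations or inequalities in the other direction.

```latex
\begin{aligned}
\text{Maximize}   \quad & P = f(x_1, x_2, \ldots, x_n) \\
\text{subject to} \quad & g_i(x_1, x_2, \ldots, x_n) \le b_i,
                          \qquad i = 1, 2, \ldots, m.
\end{aligned}
```

The coefficients appearing inside $f$ and the $g_i$, together with the $b_i$, are the parameters of the model.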

You will see numerous examples of mathematical models throughout the re- 
mainder of this book. One particularly important type that is studied in Part 2 is the 
linear programming model, where the mathematical functions appearing in both the 
objective function and the constraints are all linear functions. In the next chapter, 
specific linear programming models are constructed to fit such diverse problems as 
determining (1) the mix of products that maximizes profit, (2) the design of radiation 
therapy that effectively attacks a tumor while minimizing the damage to nearby healthy 
tissue, (3) the allocation of acreage to crops that maximizes total net return, and (4) 
the combination of pollution-abatement methods that achieves air quality standards at 
minimum cost. 

Mathematical models have many advantages over a verbal description of the 
problem. One obvious advantage is that a mathematical model describes a problem 
much more concisely. This tends to make the overall structure of the problem more 
comprehensible, and it helps to reveal important cause-and-effect relationships. In this 
way, it indicates more clearly what additional data are relevant to the analysis. It also 
facilitates dealing with the problem in its entirety and considering all its
interrelationships simultaneously. Finally, a mathematical model forms a bridge to the use of
high-powered mathematical techniques and computers to analyze the problem. Indeed,
packaged software for both microcomputers and mainframe computers is becoming
widely available for many mathematical models.

On the other hand, there are pitfalls to be avoided when using mathematical
models. Such a model is necessarily an abstract idealization of the problem, so ap- 
proximations and simplifying assumptions generally are required if the model is to be 
tractable (capable of being solved). Therefore, care must be taken to ensure that the 
model remains a valid representation of the problem. The proper criterion for judging 
the validity of a model is whether or not the model predicts the relative effects of the 
alternative courses of action with sufficient accuracy to permit a sound decision. 
Consequently, it is not necessary to include unimportant details or factors that have 
approximately the same effect for all the alternative courses of action considered. It 
is not even necessary that the absolute magnitude of the measure of performance be 
approximately correct for the various alternatives, provided that their relative values 
(i.e., the differences between their values) are sufficiently precise. Thus all that is 
required is that there be a high correlation between the prediction by the model and 
what would actually happen in the real world. To ascertain whether this requirement 
is satisfied, it is important to do considerable testing and consequent modifying of the 
model, which will be the subject of Sec. 2.4. Although this testing phase is placed 
later in the chapter, much of this model validation work actually is conducted during
the model-building phase of the study to help guide the construction of the mathe- 
matical model. 

When developing the model, a good approach is to begin with a very simple 
version and then move in evolutionary fashion toward more elaborate models that 
more nearly reflect the complexity of the real problem. This process of model enrich- 
ment continues only as long as the model remains tractable. The basic trade-off under 
consideration is between the precision and the tractability of the model. (See Selected 
Reference 6 for a detailed description of this process.) 

A crucial step in formulating the mathematical model is constructing the objec- 
tive function. This requires developing a quantitative measure of performance relative 
to each objective that has been formulated for the study. If there are multiple objec- 
tives, their respective measures commonly are then transformed and combined into a 
composite measure called the overall measure of performance. This overall measure 
might be something tangible (e.g., profit), corresponding to a higher goal of the 
organization, or it might be abstract (e.g., ‘‘utility’’). In the latter case, the task of 
developing this measure tends to be a complex one requiring a careful comparison of 
the objectives and their relative importance. After developing the overall measure of 
performance, the objective function is then obtained by expressing this measure as a 
mathematical function of the decision variables. Alternatively, there also are methods 
for explicitly considering multiple objectives simultaneously, and one of these (goal 
programming) is discussed in Chap. 8. 


2.3 Deriving a Solution 


After formulating a mathematical model for the problem under consideration, the next 
phase in an operations research study is to derive a solution from this model. You 
might think that this must be the major part of the study, but actually it is not in most 
cases. Sometimes, in fact, it is a relatively simple step, in which one of the standard
algorithms (iterative solution procedures) of operations research is applied on a
computer by using one of a number of readily available software packages. For experienced
operations research practitioners, finding a solution is the ‘‘fun part,’’ whereas the 
real work comes in the preceding and following steps, including the post-optimality 
analysis discussed later in this section. 

Since much of this book is devoted to the subject of how to obtain solutions for 
various important types of mathematical models, little needs to be said about it here. 
However, we do need to discuss the nature of such solutions. 

A common theme in operations research is the search for an optimal, or best, 
solution. Indeed, many procedures have been developed, and are presented in this 
book, for finding such solutions for certain kinds of problems. However, it needs to 
be recognized that these solutions are optimal only with respect to the model being 
used. Since the model necessarily is an idealized rather than an exact representation 
of the real problem, there cannot be any Utopian guarantee that the optimal solution 
for the model will prove to be the best possible solution that could have been imple- 
mented for the real problem. There just are too many imponderables and uncertainties 
associated with real problems. However, if the model is well formulated and tested, 
the resulting solution should tend to be a good approximation to the ideal course of 
action for the real problem. Therefore, rather than be deluded into demanding the 
impossible, the test of the practical success of an operations research study should be 
whether it provides a better guide for action than can be obtained by other means. 

The eminent management scientist and Nobel Laureate in Economics, Herbert 
Simon, points out that satisficing is much more prevalent than optimizing in actual 
practice. In coining the term satisficing as a combination of the words satisfactory 
and optimizing, Simon is describing the tendency of managers to seek a solution that 
is ‘‘good enough’’ for the problem at hand. Rather than trying to develop an overall 
measure of performance to optimally reconcile conflicts between various desirable
objectives (including well-established criteria for judging the performance of different 
segments of the organization), a more pragmatic approach may be used. Goals may 
be set to establish minimum satisfactory levels of performance in various areas, based 
perhaps on past levels of performance or on what the competition is achieving. If a 
solution is found that enables all of these goals to be met, it is likely to be adopted 
without further ado. Such is the nature of satisficing. 

The distinction between optimizing and satisficing reflects the difference be- 
tween theory and the realities frequently faced in trying to implement that theory in 
practice. In the words of one of England’s OR leaders, Samuel Eilon, ‘‘optimizing is 
the science of the ultimate; satisficing is the art of the feasible.’’¹

Operations research teams attempt to bring as much of the ‘‘science of the 
ultimate’’ as possible to the decision-making process. However, the successful team 
does so in full recognition of the overriding need of the decision maker to obtain a 
satisfactory guide for action in a reasonable period of time. Therefore, the goal of an 
operations research study should be to conduct the study in an optimal manner, re- 
gardless of whether this involves finding an optimal solution to the model or not. 
Thus, in addition to pursuing the ‘‘science of the ultimate,’’ the team should also
consider the cost of the study and the disadvantages of delaying its completion, and
then attempt to maximize the net benefits resulting from the study. In recognition of
this concept, operations research teams occasionally use only heuristic procedures
(i.e., intuitively designed procedures that do not guarantee an optimal solution) to
find a good suboptimal solution. This is most often the case when the time or cost
required to find an optimal solution for an adequate model of the problem would be
very large.


1 Eilon, Samuel: ‘‘Goals and Constraints in Decision-making,’’ Operational Research Quarterly, 23:3-15,
1972; address given at the 1971 Annual Conference of the Canadian Operational Research Society.

The discussion thus far has implied that an operations research study seeks to 
find only one solution, which may or may not be required to be optimal. In fact, this 
usually is not the case. An optimal solution for the original model may be far from 
ideal for the real problem. Therefore, post-optimality analysis is a very important 
part of most operations research studies. 

In part, post-optimality analysis involves conducting sensitivity analysis to de- 
termine which parameters of the model are most critical (the ‘‘sensitive parameters’’) 
in determining the solution. Some or all of the parameters generally are an estimate 
of some quantity (e.g., unit profit) whose exact value will become known only after 
the solution has been implemented. Therefore, after identifying the sensitive param- 
eters, special attention is given to estimating each one more closely, or at least its 
range of likely values. One then seeks a solution that remains a particularly good one 
for all of the various combinations of likely values of the sensitive parameters. 

In some cases, certain parameters of the model represent policy decisions (e.g.,
resource allocations). If so, there frequently is some flexibility in the values assigned 
to these parameters. Perhaps some can be increased by decreasing others. Post-opti- 
mality analysis includes the investigation of such trade-offs. 

In conjunction with the study phase discussed in the next section (testing the 
model and the solution), post-optimality analysis also involves obtaining a sequence 
of solutions that comprises a series of improving approximations to the ideal course 
of action. Thus the apparent weaknesses in the initial solution are used to suggest 
improvements in the model, its input data, and perhaps the solution procedure. A new 
solution is then obtained, and the cycle is repeated. This process continues until the 
improvements in the succeeding solutions become too small to warrant continuation. 
Even then, a number of alternative solutions (perhaps solutions that are optimal for
one of several plausible versions of the model and its input data) may be presented 
to management for the final selection. As suggested in Sec. 2.1, this presentation of 
alternative solutions would normally be done whenever the final choice among these 
alternatives should be based on considerations that are best left to the judgment of 
management. 

Ways in which the model and its solution are evaluated and improved will be 
discussed in the next section. 


2.4 Testing the Model and the Solution 


One of the first lessons of operations research is that it is generally not sufficient to 
rely solely on one’s intuition. This caution applies not only in obtaining a solution to 
a problem but also in evaluating the model that has been formulated to represent this 
problem. As indicated in Sec. 2.2, the proper criterion for judging the validity of a 
model is whether or not it predicts the relative effects of the alternative courses of 
action with sufficient accuracy to permit a sound decision. No matter how plausible 
the model may appear to be, it should not be accepted on faith that this condition is
satisfied. Given the difficulty of communicating and understanding all the aspects and
subtleties of a complex operational problem, there is a distinct possibility that the 
operations research team either has not been given all the true facts of the situation 
or has not interpreted them properly. For example, an important factor or interrela- 
tionship may not have been incorporated into the model or perhaps certain parameters 
have not been estimated accurately. 

Before undertaking more elaborate tests, check for obvious errors or oversights 
in the model. Reexamining the formulation of the problem and comparing it with the 
model may help to reveal any such mistakes. Another useful check is to make sure 
that all the mathematical expressions are dimensionally consistent in the units they 
use. Additional insight into the validity of the model can sometimes be obtained by 
varying the parameters and/or the decision variables and checking to see whether the 
output from the model behaves in a plausible manner. This is often especially revealing 
when the parameters or variables are assigned extreme values near their maxima or 
minima. 

A more systematic approach to testing the model is to use a retrospective test. 
When it is applicable, this test involves using historical data to reconstruct the past 
and then determining how well the model and the resulting solution would have 
performed if it had been used. Comparing the effectiveness of this hypothetical per- 
formance with what actually happened then indicates whether using this model tends 
to yield a significant improvement over current practice. It may also indicate areas 
where the model has shortcomings and requires modifications. Furthermore, by using 
alternative solutions from the model and determining their hypothetical historical per- 
formances, considerable evidence can be gathered regarding how well the model 
predicts the relative effects of alternative courses of action. 

On the other hand, a disadvantage of retrospective testing is that it uses the 
same data that guided the formulation of the model. The crucial question is whether 
or not the past is truly representative of the future. If it is not, then the model might 
perform quite differently in the future than it would have in the past. 

To circumvent this disadvantage of retrospective testing, it is sometimes useful 
to continue the status quo temporarily. This provides new data that were not available 
when the model was constructed. These data are then used in the same ways as those 
described here to evaluate the model. 

If the final model is to be used repeatedly, it is important to continue checking 
the model and its solution after the initial implementation to make sure that they 
remain valid as conditions evolve over time. The establishment of such controls is 
the subject of the next section. 


2.5 Establishing Control Over the Solution 


What happens after the testing phase has been completed and an acceptable model 
has been developed? If the model is to be used repeatedly, the next step is to install 
a well-documented system for applying the model. This system would include the 
model, the solution procedure (including post-optimality analysis), and operating pro- 
cedures for implementation. Then, even as personnel changes, the system can be 
called on at regular intervals to provide a specific numerical solution. 

It is evident that this solution remains valid for the real problem only as long 
as this specific model remains valid. However, conditions are constantly changing in
the real world. Therefore, changes might well occur that would invalidate this model;
e.g., the values of the parameters might change significantly. If these values should 
change, it is vital that the change be detected as soon as possible so that the model, 
its solution, and the resulting course of action can be modified accordingly. A plan 
for detecting such changes and making the needed modifications should be part of the 
system for applying the model. 

This plan will include provision for maintaining a general surveillance of the 
situation. In addition, it is often worthwhile to establish systematic procedures for 
detecting change and controlling the solution. To do this, it is necessary to identify 
the sensitive parameters of the model by sensitivity analysis, as discussed in Sec.
2.3. Next, a procedure is established for detecting statistically significant changes in 
each of these sensitive parameters. This procedure can sometimes be established by 
the process control charts used in statistical quality control. Finally, provision is made 
for adjusting the solution and consequent course of action whenever such a change is 
detected. 
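
The control-chart idea can be sketched in a few lines of Python. This is a minimal illustration, assuming simple limits set at the baseline mean plus or minus three sample standard deviations; practical control charts often estimate the spread differently. The function names and all the numbers are hypothetical and are not data from any study.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (lower, upper) limits: baseline mean -/+ k sample standard deviations."""
    center = mean(baseline)
    spread = stdev(baseline)
    return center - k * spread, center + k * spread

def out_of_control(baseline, new_values, k=3.0):
    """Return the new observations that fall outside the control limits."""
    lower, upper = control_limits(baseline, k)
    return [v for v in new_values if v < lower or v > upper]

# Hypothetical past estimates of one sensitive parameter (e.g., a unit profit)
baseline = [3.0, 3.1, 2.9, 3.0, 3.2, 2.8, 3.0]
# Hypothetical new observations collected after implementation
recent = [3.1, 2.9, 3.9]

# Any value listed here would signal that the model should be re-examined.
print(out_of_control(baseline, recent))
```

An observation flagged in this way would be the trigger for adjusting the solution and the consequent course of action.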


2.6 Implementation 


The last phase of an operations research study is to implement the final solution as
approved by the decision maker. This phase is a critical one because it is here, and 
only here, that the benefits of the study are reaped. Therefore, it is important for the 
operations research team to participate in launching this phase, both to make sure that
the solution is accurately translated into an operating procedure and to rectify any
flaws in the solution that are then uncovered. 

The success of the implementation phase depends a great deal upon the support 
of both top management and operating management. Consequently, as mentioned in 
Secs. 2.1 and 2.4, the operations research team should encourage the active partici- 
pation of management in formulating the problem and evaluating the solution. Ob- 
taining the guidance of management is valuable in its own right for identifying relevant 
special considerations and thereby avoiding potential pitfalls during these phases. 
However, making management a party to the study also serves to enlist their active
support for its implementation. 

The implementation phase involves several steps. First, the operations research 
team gives operating management a careful explanation of the solution to be adopted 
and how it relates to operating realities. Next, these two parties share the responsibility 
for developing the procedures required to put this solution into operation. Operating 
management then sees that a detailed indoctrination is given to the personnel involved, 
and the new course of action is initiated. If successful, the model and the solution
procedure may be used periodically to provide guidance to management. With this in 
mind, the operations research team monitors the initial experience with the course of 
action taken and seeks to identify any modifications that should be made in the future. 

Upon culminating a study, it is appropriate for the operations research team to 
document its methodology clearly and accurately enough so that the work is repro- 
ducible. Replicability should be part of the professional ethical code of the operations 
researcher. This condition is especially crucial when controversial public policy issues 
are being studied. 


2.7 Conclusions 


Although the remainder of this book focuses primarily on constructing and solving 
mathematical models, we have tried to emphasize in the present chapter that this 
constitutes only a portion of the overall process involved in conducting a typical 
operations research study. The other phases described here also are very important to 
the success of the study. Try to keep in perspective the role of the model and the 
solution procedure in the overall process as you move through the subsequent chapters. 
Then, after gaining a deeper understanding of mathematical models, we suggest 
that you plan to return to review this chapter again in order to further sharpen this 
perspective. 

Many of the phases discussed in this chapter entail the use of software tools, 
including decision support systems. Operations research is closely intertwined with 
the use of computers. Until recently, these have been almost exclusively mainframe 
computers, but now microcomputers also are being widely used for dealing with 
smaller problems. 

In concluding this discussion of the major phases of an operations research study, 
it should be emphasized that there are many exceptions to the ‘‘rules’’ prescribed in 
this chapter. By its very nature, operations research requires considerable ingenuity 
and innovation, so it is impossible to write down any standard procedure that should 
always be followed by operations research teams. Rather, the preceding description 
may be viewed as a model that roughly represents how successful operations research 
studies are conducted. 


Introduction to Linear Programming


Many people rank the development of linear programming among the most important 
scientific advances of the mid-twentieth century, and we must agree with this assess- 
ment. Its impact since just 1950 has been extraordinary. Today it is a standard tool 
that has saved many thousands or millions of dollars for most companies or businesses 
of even moderate size in the various industrialized countries of the world, and its use 
in other sectors of society has been spreading rapidly. Dozens of textbooks have been 
written about the subject, and published articles describing important applications now 
number in the hundreds. A very major proportion of all scientific computation on 
computers is devoted to the use of linear programming. 

What is the nature of this remarkable tool, and what kinds of problems does it 
address? You will gain insight into this as you work through subsequent examples. 
However, a verbal summary may help provide perspective. Briefly, the most common 
type of application involves the general problem of allocating limited resources among 
competing activities in the best possible (i.e., optimal) way. This problem of allocation 
can arise whenever one must select the level of certain activities that compete for 
scarce resources necessary to perform those activities. The variety of situations to
which this description applies is diverse indeed, ranging from the allocation of pro- 
duction facilities to products to the allocation of national resources to domestic needs, 
from portfolio selection to the selection of shipping patterns, from agricultural plan- 
ning to the design of radiation therapy, and so on. However, the one common ingre- 
dient in each of these situations is the necessity for allocating resources to activities. 

Linear programming uses a mathematical model to describe the problem of 
concern. The adjective linear means that all the mathematical functions in this model 
are required to be linear functions. The word programming does not refer here to 
computer programming; rather, it is essentially a synonym for planning. Thus linear 
programming involves the planning of activities to obtain an optimal result, i.e., a 
result that reaches the specified goal best (according to the mathematical model) among 
all feasible alternatives. 

Although allocating resources to activities is the most common type of appli- 
cation, linear programming has numerous other important applications as well. In fact, 
any problem whose mathematical model fits the very general format for the linear 
programming model is a linear programming problem. Furthermore, a remarkably 
efficient solution procedure, called the simplex method, is available for solving linear 
programming problems of even enormous size. These are some of the reasons for the 
tremendous impact of linear programming in recent decades. 

Because of its great importance, we devote this and the next six chapters spe- 
cifically to linear programming. After this chapter introduces the general features of 
linear programming, Chaps. 4 and 5 focus on the simplex method. Chapter 6 discusses 
the further analysis of linear programming problems after the simplex method has 
been initially applied. Chapter 7 considers several special types of linear programming 
problems whose importance warrants individual study. Chapter 8 then concentrates 
on the formulation of linear programming models. Finally, Chap. 9 presents several 
widely used extensions of the simplex method, and then introduces a new interior- 
point algorithm that sometimes can solve even larger linear programming problems 
than the simplex method. 

You also can look forward to seeing applications of linear programming to other 
areas of operations research in several later chapters. 

We begin this chapter by developing a miniature prototype example of a linear 
programming problem. This example is small enough to be solved graphically in a 
straightforward way. We then present the general linear programming model and its 
basic assumptions. The chapter concludes with some additional examples of linear 
programming applications. 


3.1 Prototype Example 


The WYNDOR GLASS CO. is a producer of high-quality glass products, including 
windows and glass doors. It has three plants. Aluminum frames and hardware are 
made in Plant 1, wood frames are made in Plant 2, and Plant 3 is used to produce 
the glass and assemble the products. 

Because of declining earnings, top management has decided to revamp the 
product line. Several unprofitable products are being discontinued, and this act will 
release production capacity to undertake one or both of two potential new products 
that have been in demand. One of these proposed products (product 1) is an 8-foot
glass door with aluminum framing. The other product (product 2) is a large (4 × 6
foot) double-hung wood-framed window. The Marketing Department has concluded
that the company could sell as much of either product as could be produced with the
available capacity. However, because both products would be competing for the same
production capacity in Plant 3, it is not clear which mix between the two products
would be most profitable. Therefore, management asked the Operations Research
Department to study this question.


Table 3.1 Data for Wyndor Glass Co.

                       Capacity Used per Unit
                          Production Rate
                       ----------------------
                              Product             Capacity
Plant                      1           2          Available
1                          1           0              4
2                          0           2             12
3                          3           2             18
Unit profit (dollars)      3           5

After some investigation, the OR Department determined (1) the percentage of 
each plant’s production capacity that would be available for these products, (2) the 
percentages required by each product for each unit produced per minute, and (3) the 
unit profit for each product. This information is summarized in Table 3.1. 

The OR Department immediately recognized that this was a linear programming 
problem of the classic product mix type, and it next undertook the formulation and 
solution of the problem. 


FORMULATION AS A LINEAR PROGRAMMING PROBLEM: To formulate the
mathematical (linear programming) model for this problem, let $x_1$ and $x_2$ represent the
number of units produced per minute of products 1 and 2, respectively, and let Z be
the resulting contribution to profit per minute. Thus $x_1$ and $x_2$ are the decision variables
for the model. Using the bottom row of Table 3.1,


    $Z = 3x_1 + 5x_2$.


The objective is to choose the values of $x_1$ and $x_2$ so as to maximize $Z = 3x_1 + 5x_2$,
subject to the restrictions imposed on their values by the limited plant capacities
available. Table 3.1 implies that each unit of product 1 produced per minute would
use 1 percent of Plant 1 capacity, whereas only 4 percent is available. This restriction
is expressed mathematically by the inequality $x_1 \le 4$. Similarly, Plant 2 imposes the
restriction that $2x_2 \le 12$. The percentage of Plant 3 capacity consumed by choosing
$x_1$ and $x_2$ as the new products’ production rates would be $3x_1 + 2x_2$. Therefore, the
mathematical statement of the Plant 3 restriction is $3x_1 + 2x_2 \le 18$. Finally, since
production rates cannot be negative, it is necessary to restrict the decision variables
to be nonnegative: $x_1 \ge 0$ and $x_2 \ge 0$.

To summarize, in the mathematical language of linear programming, the
problem is to choose the values of $x_1$ and $x_2$ so as to


    Maximize $Z = 3x_1 + 5x_2$,

subject to the restrictions

    $x_1 \le 4$
    $2x_2 \le 12$
    $3x_1 + 2x_2 \le 18$

and

    $x_1 \ge 0$, $x_2 \ge 0$.


(Notice how the layout of the coefficients of $x_1$ and $x_2$ in this linear programming
model essentially duplicates the information summarized in Table 3.1.)
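
As a quick check of the formulation, the model can be written out as ordinary code. The following Python sketch is not part of the OR Department's study; the function names `profit` and `is_feasible` and the trial points are assumptions made here for illustration.

```python
# Wyndor Glass Co. model from this section, written as plain Python.
# x1 and x2 are the production rates (units per minute) of products 1 and 2.

def profit(x1, x2):
    """Objective function Z = 3*x1 + 5*x2 (dollars per minute)."""
    return 3 * x1 + 5 * x2

def is_feasible(x1, x2):
    """True if (x1, x2) satisfies every restriction of the model."""
    return (
        x1 <= 4                      # Plant 1 capacity
        and 2 * x2 <= 12             # Plant 2 capacity
        and 3 * x1 + 2 * x2 <= 18    # Plant 3 capacity
        and x1 >= 0 and x2 >= 0      # nonnegativity
    )

# Hypothetical trial points, chosen only to exercise the two functions.
for point in [(0, 0), (4, 0), (2, 6), (4, 6)]:
    print(point, "feasible:", is_feasible(*point), "Z =", profit(*point))
```

The last trial point, (4, 6), respects the Plant 1 and Plant 2 restrictions but not the Plant 3 restriction, which shows why the two products compete for capacity.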


GRAPHICAL SOLUTION: This very small problem has only two decision variables,
and therefore only two dimensions, so a graphical procedure can be used to solve it.
This procedure involves constructing a two-dimensional graph with $x_1$ and $x_2$ as the
axes. The first step is to identify the values of $(x_1, x_2)$ that are permitted by the
restrictions. This is done by drawing the lines that must border the range of permissible
values. To begin, note that the nonnegativity restrictions, $x_1 \ge 0$ and $x_2 \ge 0$, require
$(x_1, x_2)$ to lie on the positive side of the axes (including actually on either axis). Next,
observe that the restriction $x_1 \le 4$ means that $(x_1, x_2)$ cannot lie to the right of the
line $x_1 = 4$. These results are shown in Fig. 3.1, where the shaded area contains the
only values of $(x_1, x_2)$ that are still allowed.

In a similar fashion, the restriction $2x_2 \le 12$ implies that the line $2x_2 = 12$
should be added to the boundary of the permissible region. The final restriction,
$3x_1 + 2x_2 \le 18$, requires plotting the points $(x_1, x_2)$ such that $3x_1 + 2x_2 = 18$
(another line) to complete the boundary. (Note that the points such that $3x_1 + 2x_2 \le 18$
are those that lie either underneath or on the line $3x_1 + 2x_2 = 18$, so this is the
limiting line beyond which the inequality ceases to hold.) The resulting region of
permissible values of $(x_1, x_2)$ is shown in Fig. 3.2.

The final step is to pick out the point in this region that maximizes the value of
$Z = 3x_1 + 5x_2$. To discover how to perform this step efficiently, begin by trial and
error. Try, for example, $Z = 10 = 3x_1 + 5x_2$ to see if there are in the permissible
region any values of $(x_1, x_2)$ that yield a value of Z as large as 10. By drawing the
line $3x_1 + 5x_2 = 10$ (see Fig. 3.3), you can see that there are many points on this
line that lie within the region. Therefore, try a larger value of Z, say, for example,
$Z = 20 = 3x_1 + 5x_2$. Again, Fig. 3.3 reveals that a segment of the line
$3x_1 + 5x_2 = 20$ lies within the region, so that the maximum permissible value of Z must be
at least 20.


Figure 3.1 Shaded area shows values of $(x_1, x_2)$ allowed by $x_1 \ge 0$, $x_2 \ge 0$, $x_1 \le 4$.


Figure 3.2 Shaded area shows permissible values of $(x_1, x_2)$.

Figure 3.3 Value of (x_1, x_2) that maximizes 3x_1 + 5x_2. [The plot shows the parallel
lines Z = 10 = 3x_1 + 5x_2, Z = 20 = 3x_1 + 5x_2, and Z = 36 = 3x_1 + 5x_2.]

Now notice in Fig. 3.3 that the two lines just constructed are parallel, and that
the line giving a larger value of Z (Z = 20) is farther up and away from the origin
than the other line (Z = 10). Thus this trial-and-error procedure involves nothing
more than drawing a family of parallel lines containing at least one point in the
permissible region and selecting the line that is the greatest distance from the origin
(in the direction of increasing values of Z). This line passes through the point (2, 6)
as indicated in Fig. 3.3, so that the equation is 3x_1 + 5x_2 = 3(2) + 5(6) = 36 =
Z. [Note that the point (2, 6) lies at the intersection of the two lines, 2x_2 = 12 and
3x_1 + 2x_2 = 18, shown in Fig. 3.2, so that this point can be calculated algebraically
as the simultaneous solution of these two equations.]
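
The algebraic calculation mentioned in the note can be sketched in a few lines of
Python. This is only an illustration: the helper name solve_2x2 is hypothetical, and
Cramer's rule is simply one convenient (assumed) way to solve the two boundary
equations 2x_2 = 12 and 3x_1 + 2x_2 = 18 exactly.

```python
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x1 + a12*x2 = b1 and a21*x1 + a22*x2 = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the two lines are parallel; no unique intersection")
    x1 = Fraction(b1 * a22 - a12 * b2, det)
    x2 = Fraction(a11 * b2 - b1 * a21, det)
    return x1, x2


# Boundary lines taken from Fig. 3.2: 2*x2 = 12 and 3*x1 + 2*x2 = 18.
x1, x2 = solve_2x2(0, 2, 12, 3, 2, 18)

# Objective function of the Wyndor Glass Co. problem.
Z = 3 * x1 + 5 * x2
print(x1, x2, Z)
```

Under these assumptions the script reproduces the point (2, 6) and the value Z = 36
that were found graphically.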

Having seen the trial-and-error procedure for finding (2, 6), you now can stream- 
line this approach for other problems. Rather than drawing several parallel lines, it is 
sufficient to form a single line with a ruler to establish the slope. Then move the ruler 
with fixed slope through the region of permissible values in the direction of improving 
Z. (When the objective is to minimize Z, move the ruler in the direction that decreases 
Z.) Stop moving the ruler at the last instant that it still passes through a point in this 
region. This point is the desired solution. 


Conclusions: The OR Department used this approach to find that the desired
solution is x_1 = 2, x_2 = 6, with Z = 36. This solution indicates that the Wyndor
Glass Co. should produce products 1 and 2 at the rate of two per minute and six per
minute, respectively, with a resulting profitability of $36/minute. No other mix of
the two products would be so profitable, according to the model.

However, we emphasized in Chap. 2 that well-conducted operations research 
studies do not simply find one solution for the initial model formulated and stop. All 
six phases described in Chap. 2 are important, including thorough testing of the model 
(see Sec. 2.4) and post-optimality analysis (see Sec. 2.3). 

In full recognition of these practical realities, the OR Department now is ready 
to evaluate the validity of the model more critically (to be continued in Sec. 3.3) and 
to perform sensitivity analysis on the effect of the estimates in Table 3.1 being in- 
accurate (to be continued in Sec. 6.7). 


3.2 The Linear Programming Model 


The Wyndor Glass Co. problem nicely illustrates a typical linear programming prob- 
lem (miniature version). However, linear programming is too versatile to be com- 
pletely described by any single example. In this section we discuss the general char- 
acteristics of linear programming problems, including the various legitimate forms of 
the mathematical model for linear programming. 

Let us begin with some basic terminology and notation. The first column of 
Table 3.2 summarizes the components of the Wyndor Glass Co. problem. The second 
column then introduces more general terms for these same components that will fit


Table 3.2 Common Terminology for Linear Programming

Prototype Example                       General Problem
Production capacities of plants         Resources
3 plants                                m resources
Production of products                  Activities
2 products                              n activities
Production rate of product j (x_j)      Level of activity j (x_j)
Profit (Z)                              Overall measure of performance (Z)


Table 3.3 Data for Linear Programming Model

                        Resource Usage per Unit of Activity
                                     Activity                  Amount of
Resource                1        2        ...      n           Resource Available
1                       a_11     a_12     ...      a_1n        b_1
2                       a_21     a_22     ...      a_2n        b_2
...                     ...      ...      ...      ...         ...
m                       a_m1     a_m2     ...      a_mn        b_m
ΔZ/unit of activity     c_1      c_2      ...      c_n
Level of activity       x_1      x_2      ...      x_n


most linear programming problems. The key terms are resources and activities, where 
the number of each is denoted by m and n, respectively. As described in the intro- 
duction to the chapter, the resources are needed to perform these activities, but the 
amount available of each resource is limited, so a careful allocation of resources to 
activities must be made. Determining this allocation involves choosing the levels of 
the activities (the values of the decision variables) that achieve the best possible value 
of the overall measure of performance Z. 

The standard notation of linear programming is summarized in Table 3.3. For
activity j (j = 1, 2, ..., n), c_j is the increase in Z that would result from each unit
of increase in x_j (the level of activity j). For resource i (i = 1, 2, ..., m), b_i is the
amount available for allocation to the activities. Finally, a_ij is the amount of resource
i consumed by each unit of activity j (for i = 1, 2, ..., m and j = 1, 2, ..., n).
This set of data (the a_ij, b_i, and c_j) constitutes the parameters (input constants) of
the linear programming model.

Notice carefully the complete correspondence between Table 3.3 (except for the 
extra row added at the bottom) and Table 3.1. 


A Standard Form of the Model 


Proceeding just as for the example, we can now formulate the mathematical model
for this general problem of allocating resources to activities. In particular, this model
is to select the values for x_1, x_2, ..., x_n (the decision variables) so as to

    Maximize Z = c_1 x_1 + c_2 x_2 + ... + c_n x_n,

subject to the restrictions

    a_11 x_1 + a_12 x_2 + ... + a_1n x_n ≤ b_1
    a_21 x_1 + a_22 x_2 + ... + a_2n x_n ≤ b_2
        ...
    a_m1 x_1 + a_m2 x_2 + ... + a_mn x_n ≤ b_m

and x_1 ≥ 0, x_2 ≥ 0, ..., x_n ≥ 0.


We call this our standard form¹ for the linear programming problem. Any situation
whose mathematical formulation fits this model is a linear programming problem.

¹ This is called our standard form rather than the standard form because some textbooks adopt other forms.

Notice that the model for the Wyndor Glass Co. problem fits our standard form,
with m = 3 and n = 2.

Common terminology for the linear programming model can now be summarized.
The function being maximized, c_1 x_1 + c_2 x_2 + ... + c_n x_n, is called the
objective function. The restrictions normally are referred to as constraints. The first
m constraints (those with a function a_i1 x_1 + a_i2 x_2 + ... + a_in x_n, representing the
total usage of resource i, on the left) are sometimes called functional constraints.
Similarly, the x_j ≥ 0 restrictions are called nonnegativity constraints.
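
To make the notation concrete, the sketch below stores the parameters of the Wyndor
Glass Co. model (m = 3, n = 2) as Python lists and evaluates the objective function
and the constraints for a trial solution. The names c, A, b, objective, resource_usage,
and satisfies_constraints are illustrative choices made here; only the symbols c_j,
a_ij, and b_i come from the text.

```python
# Parameters of the Wyndor Glass Co. model in the notation of Table 3.3.
c = [3, 5]            # c_j: increase in Z per unit of activity j
A = [[1, 0],          # a_ij: usage of resource i per unit of activity j
     [0, 2],
     [3, 2]]
b = [4, 12, 18]       # b_i: amount of resource i available


def objective(x):
    """Value of the objective function c_1*x_1 + ... + c_n*x_n."""
    return sum(c_j * x_j for c_j, x_j in zip(c, x))


def resource_usage(x):
    """Left-hand sides of the m functional constraints."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def satisfies_constraints(x):
    """True if x meets every functional and nonnegativity constraint."""
    functional = all(u <= b_i for u, b_i in zip(resource_usage(x), b))
    nonnegativity = all(x_j >= 0 for x_j in x)
    return functional and nonnegativity


print(objective([2, 6]), resource_usage([2, 6]), satisfies_constraints([2, 6]))
```

Keeping the data (c, A, b) separate from the functions mirrors the point made above:
any problem whose data fit this layout is a linear programming problem in our
standard form.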


Other Forms 


We now hasten to add that the preceding model does not actually fit the natural form 
of some linear programming problems. The other legitimate forms are the following: 


1. Minimizing rather than maximizing the objective function:

    Minimize Z = c_1 x_1 + c_2 x_2 + ... + c_n x_n.

2. Some functional constraints with a greater-than-or-equal-to inequality:

    a_i1 x_1 + a_i2 x_2 + ... + a_in x_n ≥ b_i    for some values of i.

3. Some functional constraints in equation form:

    a_i1 x_1 + a_i2 x_2 + ... + a_in x_n = b_i    for some values of i.

4. Deleting the nonnegativity constraints for some decision variables:

    x_j unrestricted in sign    for some values of j.


Any problem that mixes some or all of these forms with the remaining parts of the 
preceding model is still a linear programming problem. Our interpretation of allocating 
limited resources among competing activities may no longer apply very well, if at all, 
but regardless of the interpretation or context, all that is required is that the mathe- 
matical statement of the problem fit the allowable forms. 

In Sec. 4.6 you will see that all four of these other legitimate forms can be rewritten
in an equivalent way to fit the model just discussed. Thus, every linear programming
problem can be put into our standard form if so desired. We shall take advantage of 
this fact everywhere that procedures for solving linear programming problems are 
discussed (except Sec. 4.6) by assuming that the problems are in our standard form. 


Terminology for Solutions of the Model 


You may be used to having the term solution mean the final answer to a problem, 
but the convention in linear programming (and its extensions) is quite different. Here, 
any specification of values for the decision variables (x_1, x_2, ..., x_n) is called a
solution, regardless of whether it is a desirable or even an allowable choice. Different 
types of solutions are then identified by using an appropriate adjective. 


A feasible solution is a solution for which all the constraints are satisfied. 


In the example, (2, 3) and (4, 1) in Fig. 3.2 are feasible solutions, but (-1, 3) and
(4, 4) are infeasible solutions.
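
These four examples can be checked mechanically. The following sketch lists the
constraints of the Wyndor Glass Co. problem and reports which ones, if any, a given
solution violates. The names CONSTRAINTS and violated are hypothetical helpers
introduced only for this illustration.

```python
# Each entry pairs a readable label with a test of that constraint.
CONSTRAINTS = [
    ("x1 <= 4", lambda x1, x2: x1 <= 4),
    ("2*x2 <= 12", lambda x1, x2: 2 * x2 <= 12),
    ("3*x1 + 2*x2 <= 18", lambda x1, x2: 3 * x1 + 2 * x2 <= 18),
    ("x1 >= 0", lambda x1, x2: x1 >= 0),
    ("x2 >= 0", lambda x1, x2: x2 >= 0),
]


def violated(x1, x2):
    """Labels of the constraints that the solution (x1, x2) fails to satisfy."""
    return [label for label, holds in CONSTRAINTS if not holds(x1, x2)]


for point in [(2, 3), (4, 1), (-1, 3), (4, 4)]:
    bad = violated(*point)
    status = "feasible" if not bad else f"infeasible, violates {bad}"
    print(point, status)
```

A solution is feasible exactly when the returned list is empty, which matches the
definition above: all the constraints must be satisfied.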


The feasible region is the collection of all feasible solutions. 


The feasible region in the example is the entire shaded area in Fig. 3.2. 

It is possible for a problem to have no feasible solutions. This would have 
happened in the example if the new products had been required to return a net profit 
of at least $50/minute to justify discontinuing part of the current product line. The 
corresponding constraint, 3x_1 + 5x_2 ≥ 50, would eliminate the entire feasible region,
so no mix of new products would be superior to the status quo. 
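
One way to see why a profit of 50 is out of reach (a short supporting calculation,
not part of the original discussion) is to bound Z using only the second and third
functional constraints, 2x_2 ≤ 12 and 3x_1 + 2x_2 ≤ 18:

```latex
\begin{aligned}
Z = 3x_1 + 5x_2 &= (3x_1 + 2x_2) + \tfrac{3}{2}\,(2x_2) \\
                &\le 18 + \tfrac{3}{2}\,(12) = 36 < 50 .
\end{aligned}
```

Every feasible solution therefore has Z ≤ 36, so none can also satisfy the
requirement that Z be at least 50.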

Given that there are feasible solutions, the goal of linear programming is to find 
which one is best, as measured by the value of the objective function in the model. 


An optimal solution is a feasible solution that has the most favorable value of the 
objective function. 


Most favorable value means the largest or smallest value, depending upon whether
the objective is maximization or minimization. Thus an optimal solution
maximizes/minimizes the objective function over the entire feasible region.

Most problems will have just one optimal solution. However, it is possible to
have more than one. This would occur in the example if the unit profitability of
product 2 were changed to $2, thereby changing the objective function to Z =
3x_1 + 2x_2, so that all the points on the line segment connecting (2, 6) and (4, 3)
would be optimal. As in this case, any problem having multiple optimal solutions will
have an infinite number of them.
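
A quick numerical check of this claim is sketched below. With the modified objective
function, Z coincides with the left-hand side of the constraint 3x_1 + 2x_2 ≤ 18, so
no feasible solution can exceed 18, and every point on the segment attains that
value. The function names Z and on_segment and the parameter t are assumptions made
only for this illustration.

```python
from fractions import Fraction


def Z(x1, x2):
    """Modified objective function: product 2 now earns $2 per unit."""
    return 3 * x1 + 2 * x2


def on_segment(t):
    """Point a fraction t (0 <= t <= 1) of the way from (2, 6) to (4, 3)."""
    return (2 + 2 * t, 6 - 3 * t)


# Evaluate Z at five evenly spaced points along the segment.
values = [Z(*on_segment(Fraction(k, 4))) for k in range(5)]
print([float(v) for v in values])
```

Because t can be any number between 0 and 1, the same value of Z is attained at
infinitely many points, as stated above.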

Another possibility is that a problem has no optimal solutions. This occurs only 
if (1) it has no feasible solutions or (2) the constraints do not prevent increasing the 
value of the objective function (Z) indefinitely in the favorable direction (positive or 
negative). For example, the latter case would result if the last two functional
constraints were mistakenly deleted in the example. A discussion of how the simplex
method identifies these unusual cases is included in Secs. 4.5 (for case 2) and 4.6 
(for case 1); we assume until then that they do not arise. 
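
The second case is easy to see numerically. In the sketch below, only x_1 ≤ 4 and the
nonnegativity constraints remain (the hypothetical deletion described above), so x_2
can be increased without limit and Z grows with it. The function names are
illustrative assumptions.

```python
def feasible_without_last_two(x1, x2):
    """Feasibility when only x1 <= 4 and the nonnegativity constraints remain."""
    return 0 <= x1 <= 4 and x2 >= 0


def Z(x1, x2):
    """Objective function of the Wyndor Glass Co. problem."""
    return 3 * x1 + 5 * x2


# Z keeps growing as x2 grows, and every one of these points is feasible.
for x2 in (10, 1_000, 1_000_000):
    assert feasible_without_last_two(4, x2)
    print(x2, Z(4, x2))
```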


3.3 Assumptions of Linear Programming 


All the assumptions of linear programming actually are implicit in the model formu- 
lation given in Sec. 3.2. However, it is good to highlight these assumptions so you 
can more easily evaluate how well linear programming applies to any given problem. 
Furthermore, we still need to see why the OR Department of the Wyndor Glass Co. 
concluded that a linear programming formulation provided a satisfactory representation 
of their problem. 


Proportionality 


Proportionality is an assumption about individual activities considered independently
of the others (whereas the subsequent assumption of additivity concerns the effect of
conducting activities jointly). Therefore, consider the case where only one of the n
activities is undertaken. Call it activity k, so that x_j = 0 for all j = 1, 2, ..., n
except j = k.

The assumption is that (1) the overall measure of performance Z equals c_k x_k and
(2) the usage of each resource i equals a_ik x_k; that is, both quantities are directly
proportional to the level of each activity k conducted by itself (k = 1, 2, ..., n).
This implies in particular that there is no extra startup cost associated with beginning
the activity and that the proportionality holds over the entire range of levels of the
activity.

Table 3.4 Examples of Satisfying or Violating Proportionality

                   Profit from Product 1
       Proportionality    Proportionality Violated
x_1    Satisfied          Case 1    Case 2    Case 3
0      0                  0         0         0
1      3                  2         3         3
2      6                  5         7         5
3      9                  8         12        6

To illustrate, consider the first term (3x_1) in the objective function (Z = 3x_1 +
5x_2) for the Wyndor Glass Co. problem. This term represents the profit generated per
minute by producing product 1 at the rate of x_1 units per minute. The proportionality
satisfied column of Table 3.4 shows the case that was assumed in Sec. 3.1, namely,
that this profit is indeed proportional to x_1 so that 3x_1 is the appropriate term for the
objective function. By contrast, the next three columns show different hypothetical
cases where the proportionality assumption would be violated.

Case I would arise if there were startup costs associated with initiating the 
production of product 1. For example, there might be costs involved with setting up 
the production facilities. There might also be costs associated with arranging the 
distribution of the new product. Because these are one-time costs, they would need 
to be amortized on a per-minute basis to be commensurable with Z (profit per minute). 
Suppose that this amortization were done and that the total startup cost amounted to
$1/minute, but that the profit without considering the startup cost would be 3x_1. This
would mean that the contribution from product 1 to Z should be (3x_1 - 1) for
x_1 > 0. This function, which gives the numerical values shown for Case 1, certainly
is not proportional to x_1.¹
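
The effect of the startup cost can be sketched in code. The function profit_case1
implements the contribution just described (zero when nothing is produced, 3x_1 - 1
otherwise), and is_proportional is a hypothetical helper that tests whether profit
per unit stays constant over a set of production levels.

```python
def profit_case1(x1):
    """Contribution of product 1 to Z with an amortized startup cost of $1/minute."""
    return 3 * x1 - 1 if x1 > 0 else 0


def profit_proportional(x1):
    """Contribution assumed in Sec. 3.1, with no startup cost."""
    return 3 * x1


def is_proportional(f, levels):
    """True if f(0) = 0 and f(x)/x is the same constant for every positive level."""
    ratios = {f(x) / x for x in levels if x > 0}
    return f(0) == 0 and len(ratios) == 1


levels = range(4)  # x1 = 0, 1, 2, 3
print([profit_case1(x) for x in levels], is_proportional(profit_case1, levels))
print([profit_proportional(x) for x in levels], is_proportional(profit_proportional, levels))
```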

At first glance, it might appear that Case 2 in Table 3.4 is quite similar to Case
1. However, Case 2 actually arises in a very different way. There no longer is a
startup cost, and the profit from the first unit of product 1 per minute is indeed $3,
as originally assumed. However, there now is an increasing marginal return, i.e.,
the increment in Z (ΔZ) due to incrementing x_1 by 1 keeps increasing as x_1 is increased,
as summarized below:

    ΔZ = 3 when x_1 = 0 → x_1 = 1;
    ΔZ = 4 when x_1 = 1 → x_1 = 2;
    ΔZ = 5 when x_1 = 2 → x_1 = 3.

This violation of proportionality might occur because of economies that can sometimes
be achieved at higher levels of production, e.g., through using more efficient
high-volume machinery, longer production runs, and the learning-curve effect whereby
workers become more efficient as they gain experience with a particular mode of
production. As the incremental cost goes down, the incremental profit will go up
(assuming constant marginal revenue).

¹ If the contribution from product 1 to Z were (3x_1 - 1) for all x_1 ≥ 0, including x_1 = 0, then the fixed
constant, -1, could be deleted from the objective function without changing the optimal solution, and
proportionality would be restored. However, this "fix" does not work here because the -1 constant does
not apply when x_1 = 0.

The reverse of Case 2 is Case 3, where there is a decreasing marginal return, 
as summarized below: 


    ΔZ = 3 when x_1 = 0 → x_1 = 1;
    ΔZ = 2 when x_1 = 1 → x_1 = 2;
    ΔZ = 1 when x_1 = 2 → x_1 = 3.


This violation of proportionality might occur because the marketing costs need to go
up more than proportionally to attain increases in the level of sales. For example, it
might be possible to sell product 1 at the rate of one per minute (x_1 = 1) with no
advertising, whereas attaining sales to sustain a production rate of x_1 = 2 might
require a moderate amount of advertising, and x_1 = 3 might necessitate an extensive
advertising campaign.
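
The profit figures for Cases 2 and 3 follow from the marginal returns by simple
accumulation, as the sketch below illustrates. The helper name profit_table is
hypothetical; the increments are the ΔZ values listed for each case, and the third
call uses a constant increment of 3 to represent the proportional case.

```python
from itertools import accumulate


def profit_table(marginal_returns):
    """Cumulative profit at x1 = 0, 1, 2, ... built from successive increments of Z."""
    return [0, *accumulate(marginal_returns)]


case2 = profit_table([3, 4, 5])    # increasing marginal return
case3 = profit_table([3, 2, 1])    # decreasing marginal return
linear = profit_table([3, 3, 3])   # proportionality satisfied

print(case2, case3, linear)
```

Only the constant-increment case yields profits that are a fixed multiple of x_1.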

All three cases are hypothetical examples of ways in which the proportionality 
assumption could be violated. What is the actual situation? The actual profit from 
producing product 1 (or any other product) is derived from the sales revenue minus 
various direct and indirect costs. Inevitably, some of these cost components are not 
strictly proportional to the production rate, perhaps for one of the reasons illustrated 
above. However, the real question is whether, after cumulating all of the components 
of profit, proportionality is a reasonable approximation for practical modeling pur- 
poses. For the Wyndor Glass Co. problem the OR Department checked both the 
objective function and the functional constraints. The conclusion was that proportion- 
ality could indeed be assumed without serious distortion. 

For other problems, what happens when the proportionality assumption does not 
hold even as a reasonable approximation? In most cases, this means you must use 
nonlinear programming instead (presented in Chap. 14). However, we do point out 
in Sec. 14.8 that a certain important kind of nonproportionality can still be handled 
by linear programming by reformulating the problem appropriately. Furthermore, if 
the assumption is violated only because of startup costs, there is an extension of linear 
programming (mixed integer programming) that can be used, as discussed in Sec. 
13.2 (the fixed-charge problem). 


Additivity 

The proportionality assumption is not enough to guarantee that the objective function 
and constraint functions are linear. Cross-product terms will arise if there are inter- 
actions between some of the activities that would change the value of the overall 
measure of performance or the total usage of some resource. Additivity assumes that 
there are no such interactions between any of the activities, so that there are no cross- 
product terms in the model. 

To be more specific, the additivity assumption (like proportionality) applies to
both the objective function and the functions on the left-hand side of the functional
constraints. The latter type of function represents the total usage of some resource.
For both types of functions, the assumption concerns the comparison between the
total function value from jointly conducting the activities at their respective levels
(x_1, x_2, ..., x_n) and the individual contributions to the function value from
conducting each activity by itself (resetting all other variables to zero). For linear
programming, these individual contributions are c_j x_j for the objective function and
a_ij x_j for a constraint function.

The additivity assumption is that, for each function, the total function value can
be obtained by adding the individual contributions from the respective activities.

To make this definition more concrete, and clarify why we need to worry about 
this assumption, let us look at some examples. Table 3.5 shows some possible cases 
for the objective function for the Wyndor Glass Co. problem. In each case, the 
individual contributions from the products are just as assumed in Sec. 3.1, namely, 
3x_1 for product 1 and 5x_2 for product 2. The difference lies in the last row, which
gives the total function value for Z when the two products are produced jointly. The
additivity satisfied column shows the case where this total function value is obtained
simply by adding the first two rows (3 + 5 = 8), so that Z = 3x_1 + 5x_2 as previously
assumed. By contrast, the next two columns show hypothetical cases where the
additivity assumption would be violated.

Case 1 corresponds to an objective function of Z = 3x_1 + 5x_2 + x_1 x_2, so that
Z = 3 + 5 + 1 = 9 for (x_1, x_2) = (1, 1), thereby violating the additivity assumption
that Z = 3 + 5. This case would arise if the two products were complementary in
some way that increases profit. For example, suppose that a major advertising
campaign would be required to market either new product produced by itself, but that the
same single campaign can effectively promote both products if the decision is made 
to produce both of them. Because a major cost is saved for the second product, their 
joint profit is somewhat more than the sum of their individual profits when each is 
produced by itself. 

Case 2 also violates the additivity assumption because of the extra term in
its objective function, Z = 3x_1 + 5x_2 - x_1 x_2, so that Z = 3 + 5 - 1 = 7 for
(x_1, x_2) = (1, 1). As the reverse of the first case, Case 2 would arise if the two
products were competitive in some way that decreases their joint profit. For example,
suppose that both products would need to use the same machinery and equipment. If
either product were produced by itself, this machinery and equipment would be
dedicated to this one use. However, producing both products would require switching the
production processes back and forth, with substantial time and cost involved in
temporarily shutting down the production of one product and setting up for the other.
Because of this major extra cost, their joint profit is somewhat less than the sum of
their individual profits when each is produced by itself.
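
The comparison that defines additivity can be written as a small test. In the sketch
below, additivity_gap (a hypothetical helper) subtracts the sum of the stand-alone
contributions from the joint function value; the gap is zero exactly when additivity
holds at the point examined.

```python
def additivity_gap(f, x1, x2):
    """Joint value of f minus the sum of the two stand-alone contributions."""
    return f(x1, x2) - (f(x1, 0) + f(0, x2))


def z_additive(x1, x2):
    return 3 * x1 + 5 * x2


def z_case1(x1, x2):
    return 3 * x1 + 5 * x2 + x1 * x2   # complementary products


def z_case2(x1, x2):
    return 3 * x1 + 5 * x2 - x1 * x2   # competitive products


gaps = [additivity_gap(f, 1, 1) for f in (z_additive, z_case1, z_case2)]
print(gaps)
```

A positive gap corresponds to products that reinforce each other, and a negative gap
to products that interfere with each other.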

The same kinds of interaction between activities can affect the additivity of the
constraint functions. For example, consider the third functional constraint of the
Wyndor Glass Co. problem, 3x_1 + 2x_2 ≤ 18. (This is the only constraint involving both
products.) This constraint concerns the production capacity of Plant 3, where 18
percent is available for the two new products, and the function on the left-hand side
(3x_1 + 2x_2) represents the percentage of the plant’s capacity that would be used by


Table 3.5 Examples of Satisfying or Violating
Additivity for the Objective Function

                         Value of Z
              Additivity    Additivity Violated
(x_1, x_2)    Satisfied     Case 1    Case 2
(1, 0)        3             3         3
(0, 1)        5             5         5
(1, 1)        8             9         7





Table 3.6 Examples of Satisfying or Violating
Additivity for a Functional Constraint

                   Amount of Resource Used
              Additivity    Additivity Violated
(x_1, x_2)    Satisfied     Case 3    Case 4
(2, 0)        6             6         6
(0, 3)        6             6         6
(2, 3)        12            15        10.8





these products. The additivity satisfied column of Table 3.6 shows this case as is,
whereas the next two columns display cases where the function has an extra
cross-product term that violates additivity. For all three columns, the individual
contributions from the products toward using the capacity of Plant 3 are just as assumed
previously, namely, 3x_1 for product 1 and 2x_2 for product 2, or 3(2) = 6 for x_1 = 2
and 2(3) = 6 for x_2 = 3. As for Table 3.5, the difference lies in the last row, which
now gives the total function value for capacity used when the two products are
produced jointly.

The capacity used for Case 3 is given by the function 3x_1 + 2x_2 + 0.5x_1 x_2,
so the total function value is 6 + 6 + 3 = 15 when (x_1, x_2) = (2, 3), which violates
the additivity assumption that the value is just 6 + 6 = 12. This case can arise in
exactly the same way as described for Case 2: namely, extra time is wasted switching
the production processes back and forth between the two products. The extra
cross-product term, 0.5x_1 x_2, would give the amount of capacity used in this way.

For Case 4, the function for capacity used is 3x_1 + 2x_2 - 0.1x_1^2 x_2, so the total
function value for (x_1, x_2) = (2, 3) is 6 + 6 - 1.2 = 10.8. This case could arise
in the following way. Similarly to Case 3, suppose that the two products require the 
same type of machinery and equipment, but suppose now that the time required to 
switch from one product to the other would be relatively small. Because each product 
goes through a sequence of production operations, individual production facilities 
normally dedicated to that product would incur occasional idle periods. During these 
otherwise idle periods, these facilities can be used by the other product, thereby saving 
production capacity. Consequently, the total production capacity used when the two 
products are produced jointly would be less than the sum of the capacities used by 
the individual products when each is produced by itself. 
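
The arithmetic for Cases 3 and 4 can be reproduced as follows. This is a minimal
sketch: the function names are hypothetical, and exact fractions are used (an
implementation choice made here) so that the value 10.8 is not blurred by
floating-point rounding.

```python
from fractions import Fraction


def used_additive(x1, x2):
    """Capacity of Plant 3 used when additivity holds."""
    return 3 * x1 + 2 * x2


def used_case3(x1, x2):
    """Case 3: extra capacity is lost to switching between the products."""
    return 3 * x1 + 2 * x2 + Fraction(1, 2) * x1 * x2


def used_case4(x1, x2):
    """Case 4: capacity is saved by filling otherwise idle periods."""
    return 3 * x1 + 2 * x2 - Fraction(1, 10) * x1**2 * x2


for used in (used_additive, used_case3, used_case4):
    joint = used(2, 3)                      # both products produced together
    separate = used(2, 0) + used(0, 3)      # each product produced by itself
    print(used.__name__, float(joint), float(separate))
```

In every case the stand-alone usages sum to 12; only the joint value differs, which
is precisely what the additivity assumption rules out.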

After analyzing the possible kinds of interaction between the two products il- 
lustrated by these four cases, the OR Department concluded that none played a major 
role in the actual Wyndor Glass Co. problem. Therefore, the additivity assumption 
was adopted as a reasonable approximation. 

For other problems, if additivity is not a reasonable assumption, so that some 
or all of the mathematical functions of the model need to be nonlinear (because of 
the cross-product terms), you definitely enter the realm of nonlinear programming 
(Chap. 14). 


Divisibility

Sometimes the decision variables have physical significance only if they have integer
values. However, the optimal solution obtained by linear programming is often a
noninteger one. Therefore, the divisibility assumption is that activity units can be
divided into any fractional levels, so that noninteger values for the decision variables
are permissible.

For the Wyndor Glass Co. problem, the decision variables represent production 
rates, which can have fractional values. Certain fractional values are more convenient 
than others because they correspond to integer numbers of people and machines work- 
ing full time on the product. However, the OR Department concluded that these minor 
adjustments could be made easily after using the model to analyze the big picture and 
to identify approximately what the combination of production rates should be. 

Frequently, linear programming is still applied even when an integer solution is 
required. If the solution obtained is a noninteger one, then the noninteger variables 
are merely rounded to integer values. This may be satisfactory, particularly if the 
decision variables are large, but it does have certain pitfalls (discussed in Sec. 13.3). 
If this approach cannot be used, then we are in the realm of integer programming, 
which is the topic of Chap. 13. However, it should be noted that linear programming 
automatically will obtain integer solutions to certain special types of problems, in- 
cluding some of those discussed in Chap. 7. 


Certainty 


The certainty assumption is that all the parameters of the model (the a_ij, b_i, and c_j
values) are known constants. In real problems, this assumption is seldom satisfied
precisely. Linear programming models usually are formulated to select some future 
course of action. Therefore, the parameters used would be based on a prediction of 
future conditions, which inevitably introduces some degree of uncertainty. 

For this reason it is usually important to conduct a thorough sensitivity analysis 
after finding a solution that is optimal under the assumed parameter values. As
discussed in Sec. 2.3, the general purpose is to identify the sensitive parameters (i.e.,
those that cannot be changed without changing the optimal solution), to try to estimate
these more closely, and then to select a solution that remains a good one over the
ranges of likely values of the sensitive parameters. This is what the OR Department 
will do for the Wyndor Glass Co. problem, as you will see in Sec. 6.7. However, it 
is necessary to acquire some more background before finishing that story. 

Occasionally, the degree of uncertainty in the parameters is too great to be 
amenable to sensitivity analysis. In this case, it is necessary to treat the parameters 
explicitly as random variables. Formulations of this kind have been developed, but
they are beyond the scope of this book. 


The Assumptions in Perspective 


We emphasized in Sec. 2.2 that a mathematical model is intended to be only an
idealized representation of the real problem. Approximations and simplifying
assumptions generally are required in order for the model to be tractable. Adding too
much detail and precision can make the model too unwieldy for useful analysis of the
problem. All that is really needed is that there be a reasonably high correlation between
the prediction of the model and what would actually happen in the real problem.

This advice certainly is applicable to linear programming. It is very common in
real applications of linear programming that almost none of the four assumptions hold
completely. Except perhaps for the divisibility assumption, minor disparities are to be
expected. This is especially true for the certainty assumption, so sensitivity analysis
normally is a must to compensate for the violation of this assumption.

However, it is important for the operations research team to examine the four 
assumptions for the problem under study and analyze just how large the disparities 
are. If any of the assumptions are violated in a major way, then a number of useful 
alternative models are available, as presented in Part 3 of the book. A disadvantage 
of these other models is that the algorithms available for solving them are not nearly 
as powerful as for linear programming, but this gap has been closing in some cases. 
For some applications, the powerful linear programming approach is used for the 
initial analysis and then a more complicated model is used to refine this analysis. 

As you work through the examples in the next section, you will find it good 
practice to analyze how well each of the four assumptions of linear programming 
applies to these problems. 


3.4 Additional Examples 


The Wyndor Glass Co. problem is a prototype example of linear programming in 
several respects: It involves allocating limited resources among competing activities, 
its model fits our standard form, and its context is the traditional one of improved 
business planning. However, the applicability of linear programming is much wider. 
In this section we begin broadening our horizons. As you study the following ex- 
amples, note that it is their underlying mathematical model rather than their context 
that characterizes them as linear programming problems. Then give some thought to 
how the same mathematical model could arise in many other contexts by merely 
changing the names of the activities and so forth. 

These examples have been kept very small (by linear programming standards) 
for ease of reading. However, much larger versions of the problems, involving 
hundreds of constraints and variables, are readily solvable by linear programming. 


Design of Radiation Therapy 


MARY is a modern success story. She truly has it all—a very successful career, a 
leadership role in her community, many friends and admirers, as well as a loving 
husband and two fine children. But now tragedy has struck. Mary has just been 
diagnosed as having a cancer at a fairly advanced stage. Specifically, she has a large 
malignant tumor in the bladder area (a ‘‘whole bladder lesion’’). 

Her family has arranged for Mary to receive the most advanced medical care 
available in the country in order to give her every possible chance for survival. 
Extensive radiation therapy (in combination with chemotherapy and surgery) is the 
only hope. 

Radiation therapy involves using an external beam treatment machine to pass
ionizing radiation through the patient’s body, damaging both cancerous and healthy
tissues. Normally, several beams are precisely administered from different angles in
a two-dimensional plane. Due to attenuation, each beam delivers more radiation to
the tissue near the entry point than to the tissue near the exit point. Scatter also causes
some delivery of radiation to tissue outside the direct path of the beam. Because tumor
cells are typically microscopically interspersed among healthy cells, the radiation
dosage throughout the tumor region must be large enough to kill the malignant cells,
which are slightly more radiosensitive, yet small enough to spare the healthy cells.
At the same time, the aggregate dose to critical tissues must not exceed established
tolerance levels in order to prevent complications that can be more serious than the
disease itself. For the same reason, the total dose to the entire healthy anatomy also
must be minimized.

Figure 3.4 Cross section of Mary’s tumor (viewed from above), nearby critical tissues,
and the radiation beams being used. [Labeled regions: 1. Bladder and tumor; 2. Rectum,
coccyx, etc.; 3. Femur, part of pelvis, etc.]

Because of the need to carefully balance all of these factors, the design of 
radiation therapy is a very delicate process. The goal of the design is to select the 
combination of beams to be used, and the intensity of each one, in order to generate
the best possible dose distribution. (The dose strength at any point in the body is 
measured in units called kilorads.) Once the treatment design has been developed, it 
then is administered in many installments, spread over several weeks. 

In Mary’s case, the size and location of her tumor make the design of her 
treatment an even more delicate process than usual. Figure 3.4 shows a diagram of a 
cross section of the tumor viewed from above, as well as nearby critical tissues to 
avoid. These tissues include critical organs (e.g., the rectum) as well as bony structures 
(e.g., the femurs and pelvis) that will attenuate the radiation. Also shown are the 
entry point and direction for the only two beams that can be used with any modicum 
of safety in this case. (Actually, we are simplifying the example at this point, because 
normally dozens of possible beams must be considered.) 

For any proposed beam of given intensity, the analysis of what the resulting 
radiation absorption by various parts of the body would be requires a complicated 
process. In brief, based on careful anatomical analysis, the energy distribution within 
the two-dimensional cross section of the tissue can be plotted on an isodose map, 
where the contour lines represent dose strength as a percentage of the dose strength 
at the entry point. A fine grid then is placed over the isodose map. By summing the 
radiation absorbed in the squares containing each type of tissue, the average dose that 
is absorbed by the tumor, healthy anatomy, and critical tissues can be calculated. 
With more than one beam (administered sequentially), the radiation absorption is 
additive. 

After thorough analysis of this type, the medical team has carefully estimated 
the data needed to design Mary’s treatment, as summarized in Table 3.7. The first 
column lists the areas of the body that must be considered, and then the next two 
columns give the fraction of the radiation dose at the entry point for each beam that 
is absorbed by the respective areas on the average. For example, if the dose level at 
the entry point for beam 1 is 1 kilorad, then an average of 0.4 kilorad will be absorbed 
by the entire healthy anatomy in the two-dimensional plane, an average of 0.3 kilorad 
will be absorbed by nearby critical tissues, an average of 0.5 kilorad will be absorbed 
by the various parts of the tumor, and 0.6 kilorad will be absorbed by the center of 


Table 3.7 Data for Design of Mary’s Radiation Therapy 

                    Fraction of Entry Dose 
                    Absorbed by Area (Average)    Restriction on Total 
Area                Beam 1        Beam 2          Average Dosage 
Healthy anatomy     0.4           0.5             Minimize 
Critical tissues    0.3           0.1             ≤ 2.7 
Tumor region        0.5           0.5             = 6 
Center of tumor     0.6           0.4             ≥ 6 





the tumor. The last column gives the restrictions on the total dosage from both beams 
that is absorbed on the average by the respective areas of the body. In particular, the 
average dosage absorption for the healthy anatomy must be as small as possible, the 
critical tissues must not exceed 2.7 kilorads, the average over the entire tumor must 
equal 6 kilorads, and the center of the tumor must be at least 6 kilorads. 
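
The additivity of the two beams can be made concrete with a short sketch. The Python below is only an illustration built on the Table 3.7 fractions; the function name `average_doses` and the trial design of 6 kilorads per beam are assumptions introduced here, not part of the original analysis.

```python
# Fractions of the entry dose absorbed on average by each area (Table 3.7),
# stored as (beam 1, beam 2) pairs.
ABSORPTION = {
    "healthy anatomy": (0.4, 0.5),
    "critical tissues": (0.3, 0.1),
    "tumor region": (0.5, 0.5),
    "center of tumor": (0.6, 0.4),
}


def average_doses(beam1_entry, beam2_entry):
    """Average kilorads absorbed by each area; the two beams simply add."""
    return {
        area: f1 * beam1_entry + f2 * beam2_entry
        for area, (f1, f2) in ABSORPTION.items()
    }


# Beam 1 alone at 1 kilorad reproduces the example given in the text.
single_beam = average_doses(1.0, 0.0)

# Hypothetical trial design (an assumption for illustration): 6 kilorads per beam.
trial = average_doses(6.0, 6.0)
```

With 1 kilorad on beam 1 alone, the result matches the figures quoted above (0.4, 0.3, 0.5, and 0.6 kilorad). The trial design shows how each restriction in the last column of the table would be checked for a candidate pair of entry doses.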

Simultaneously satisfying all of these requirements will be very difficult. The 
medical team has confided to Mary’s family that the disease has reached a critical 
stage where only a slim chance remains for successful treatment. The only chance for 
saving Mary’s life lies with developing the best possible treatment design by using 
the most advanced optimization procedure available, namely (what else), linear pro- 
gramming! 


FORMULATION AS A LINEAR PROGRAMMING PROBLEM: The two decision vari- 
ables x_1 and x_2 represent the dose (in kilorads) at the entry point for beam 1 and 
beam 2, respectively. Because the total dosage reaching the healthy anatomy is to be 
minimized, let Z denote this quantity. The data from Table 3.7 can then be used 
directly to formulate the following linear programming model.¹ 


Minimize Z = 0.4x_1 + 0.5x_2, 
subject to 0.3x_1 + 0.1x_2 ≤ 2.7 
0.5x_1 + 0.5x_2 = 6 
0.6x_1 + 0.4x_2 ≥ 6 
and x_1 ≥ 0, x_2 ≥ 0. 


Notice the differences between this model and the one in Sec. 3.1 for the Wyndor 
Glass Co. problem. The latter model involved maximizing Z, and all of the functional 
constraints were in ≤ form. This new model does not fit this same standard form, 
but it does incorporate three other legitimate forms described in Sec. 3.2, namely, 
minimizing Z, functional constraints in = form, and functional constraints in ≥ form. 

However, both models have only two variables, so this new problem also can 
be solved by the graphical procedure illustrated in Sec. 3.1. Figure 3.5 shows the 
graphical solution. The feasible region consists of just the dark line segment between 
(6, 6) and (7.5, 4.5), because the points on this segment are the only ones that si- 
multaneously satisfy all of the constraints. (Check this.) The dashed line is the ob- 
jective function line that passes through the optimal solution, (x_1, x_2) = (7.5, 4.5) 
with Z = 5.25. This solution is optimal rather than (6, 6) because decreasing Z (for 
positive values of Z) pushes the objective function line toward the origin (where Z = 
0). Z = 5.25 for (7.5, 4.5) is less than Z = 5.4 for (6, 6). 
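
The "(Check this.)" invitation above can be followed numerically. The sketch below is not a general linear programming solver; it assumes only that the tumor-region equality forces x_2 = 12 - x_1, then scans that line in steps of 1/4 with exact fractions. All names in the code are illustrative.

```python
from fractions import Fraction as F

# Coefficients from Table 3.7, as (beam 1, beam 2).
HEALTHY = (F(4, 10), F(5, 10))   # objective, to be minimized
CRITICAL = (F(3, 10), F(1, 10))  # total must be at most 2.7
TUMOR = (F(5, 10), F(5, 10))     # total must equal 6
CENTER = (F(6, 10), F(4, 10))    # total must be at least 6


def dose(coeffs, x):
    """Average dose for entry doses x = (x1, x2)."""
    return coeffs[0] * x[0] + coeffs[1] * x[1]


def is_feasible(x):
    return (
        x[0] >= 0 and x[1] >= 0
        and dose(CRITICAL, x) <= F(27, 10)
        and dose(TUMOR, x) == 6
        and dose(CENTER, x) >= 6
    )


# The equality 0.5*x1 + 0.5*x2 = 6 gives x2 = 12 - x1, so scan along that line.
candidates = [(F(k, 4), 12 - F(k, 4)) for k in range(49)]
segment = [x for x in candidates if is_feasible(x)]
endpoints = (segment[0], segment[-1])

# Z is linear, so its minimum over the segment occurs at an endpoint.
best = min(segment, key=lambda x: dose(HEALTHY, x))
best_z = dose(HEALTHY, best)
```

Under these assumptions the feasible scan points run from (6, 6) to (7.5, 4.5), and the smallest Z found is 5.25 at (7.5, 4.5), in agreement with the graphical solution.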

But what about Mary? Over a gruelling treatment period of six weeks, the 
medical team implemented this optimal design of using a total dose at the entry point 
of 7.5 kilorads for beam 1 and 4.5 kilorads for beam 2. Mary’s fighting spirit did the 
rest. As of this writing, she is alive and doing well! 


1 Actually, Table 3.7 simplifies the real situation, so the real model would be somewhat more complicated 
than this one, and would have dozens of variables and constraints. For details about the general situation, 
see Sonderman, David, and Philip G. Abrahamson: “Radiotherapy Treatment Design Using Mathematical 
Programming Models,” Operations Research, 33:705–725, 1985, and its ref. 1. 

Figure 3.5 Graphical solution for the design of Mary’s radiation therapy. 





Regional Planning 


One of the interesting social experiments in the Mediterranean region is the system 
of kibbutzim, or communal farming communities, in Israel. It is common for groups 
of kibbutzim to join together to share common technical services and to coordinate 
their production. Our next example concerns one such group of three kibbutzim, which 
we call the SOUTHERN CONFEDERATION OF KIBBUTZIM. 

Overall planning for the Southern Confederation of Kibbutzim is done in its 
Coordinating Technical Office. This office currently is planning agricultural produc- 
tion for the coming year. 

The agricultural output of each kibbutz is limited by both the amount of available 
irrigable land and by the quantity of water allocated for irrigation by the Water Com- 
missioner (a national government official). These data are given in Table 3.8. 

The crops suited for this region include sugar beets, cotton, and sorghum, and 
these are the three being considered for the upcoming season. These crops differ 


Table 3.8 Resources Data for Southern 
Confederation of Kibbutzim 

Kibbutz    Usable Land (Acres)    Water Allocation (Acre Feet) 
1          400                    600 
2          600                    800 
3          300                    375 





primarily in their expected net return per acre and their consumption of water. In 
addition, the Ministry of Agriculture has set a maximum quota for the total acreage 
that can be devoted to each of these crops by the Southern Confederation of Kibbutzim, 
as shown in Table 3.9. 

The three kibbutzim belonging to the Southern Confederation have agreed that 
every kibbutz will plant the same proportion of its available irrigable land. For ex- 
ample, if kibbutz 1 plants 200 of its available 400 acres, then kibbutz 2 must plant 
300 of its 600 acres, while kibbutz 3 plants 150 acres of its 300 acres. However, any 
combination of the crops may be grown at any of the kibbutzim. The job facing the 
Coordinating Technical Office is to plan how many acres to devote to each crop at 
the respective kibbutzim while satisfying the given restrictions. The objective is to 
maximize the total net return to the Southern Confederation as a whole. 


FORMULATION AS A LINEAR PROGRAMMING PROBLEM: The quantities to be 
decided upon are the number of acres to devote to each of the three crops at each of 
the three kibbutzim. The decision variables, x_j (j = 1, 2, ..., 9), represent these 
nine quantities, as shown in Table 3.10. Since the measure of effectiveness Z is total 
net return, the resulting linear programming model for this problem is 


Maximize Z = 400(x_1 + x_2 + x_3) + 300(x_4 + x_5 + x_6) + 100(x_7 + x_8 + x_9), 
subject to the following constraints: 
1. Usable land for each kibbutz: 
x_1 + x_4 + x_7 ≤ 400 
x_2 + x_5 + x_8 ≤ 600 
x_3 + x_6 + x_9 ≤ 300 
2. Water allocation for each kibbutz: 
3x_1 + 2x_4 + x_7 ≤ 600 


Table 3.9 Crop Data for Southern Confederation of Kibbutzim 

               Maximum Quota    Water Consumption    Net Return 
Crop           (Acres)          (Acre Feet/Acre)     (Dollars/Acre) 
Sugar beets    600              3                    400 
Cotton         500              2                    300 
Sorghum        325              1                    100 


Table 3.10 Decision Variables for Southern 
Confederation of Kibbutzim Problem 

               Allocation (Acres) 
               Kibbutz 
Crop           1      2      3 
Sugar beets    x_1    x_2    x_3 
Cotton         x_4    x_5    x_6 
Sorghum        x_7    x_8    x_9 





3x_2 + 2x_5 + x_8 ≤ 800 
3x_3 + 2x_6 + x_9 ≤ 375 
3. Total acreage for each crop: 
x_1 + x_2 + x_3 ≤ 600 
x_4 + x_5 + x_6 ≤ 500 
x_7 + x_8 + x_9 ≤ 325 
4. Equal proportion of land planted: 
(x_1 + x_4 + x_7)/400 = (x_2 + x_5 + x_8)/600 
(x_2 + x_5 + x_8)/600 = (x_3 + x_6 + x_9)/300 
(x_3 + x_6 + x_9)/300 = (x_1 + x_4 + x_7)/400 
5. Nonnegativity: 
x_j ≥ 0, for j = 1, 2, ..., 9. 


This completes the model, except that the equality constraints are not yet in an ap- 
propriate form for a linear programming model because some of the variables are on 
the right-hand side. Hence their final form¹ is 


3(x_1 + x_4 + x_7) - 2(x_2 + x_5 + x_8) = 0 
(x_2 + x_5 + x_8) - 2(x_3 + x_6 + x_9) = 0 
4(x_3 + x_6 + x_9) - 3(x_1 + x_4 + x_7) = 0. 


The Coordinating Technical Office formulated this model and then applied the 
simplex method (developed in the next chapter) to find the best solution. The solution 
they obtained is 


(x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9) = (133 1/3, 100, 25, 100, 250, 150, 0, 0, 0), 


as shown in Table 3.11. 
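
As a sanity check, the reported solution can be substituted back into the constraints. The following Python sketch uses exact fractions for the 133 1/3 acres; the water coefficients 3, 2, and 1 are taken from the water-allocation constraints, and the variable names are illustrative assumptions.

```python
from fractions import Fraction as F

LAND = [400, 600, 300]        # usable acres per kibbutz (Table 3.8)
WATER = [600, 800, 375]       # acre feet per kibbutz (Table 3.8)
QUOTA = [600, 500, 325]       # maximum acres per crop (Table 3.9)
WATER_PER_ACRE = [3, 2, 1]    # coefficients of the water constraints
NET_RETURN = [400, 300, 100]  # dollars per acre (objective function)

# acres[crop][kibbutz], crops ordered sugar beets, cotton, sorghum.
acres = [
    [F(400, 3), 100, 25],
    [100, 250, 150],
    [0, 0, 0],
]

planted = [sum(acres[c][k] for c in range(3)) for k in range(3)]
water_used = [
    sum(WATER_PER_ACRE[c] * acres[c][k] for c in range(3)) for k in range(3)
]
crop_total = [sum(row) for row in acres]
proportion = [F(planted[k], LAND[k]) for k in range(3)]

feasible = (
    all(a >= 0 for row in acres for a in row)
    and all(planted[k] <= LAND[k] for k in range(3))
    and all(water_used[k] <= WATER[k] for k in range(3))
    and all(crop_total[c] <= QUOTA[c] for c in range(3))
    and proportion[0] == proportion[1] == proportion[2]
)
total_return = sum(NET_RETURN[c] * crop_total[c] for c in range(3))
```

Substituting the solution shows that every kibbutz plants 7/12 of its land and uses its entire water allocation, the cotton quota is fully used, and the total net return is 760,000/3, or about $253,333.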


1 Actually, any one of these equations is redundant and can be deleted if desired. Because of these equations, 
any two of the land constraints also could be deleted. 


Table 3.11 Optimal Solution for Southern 
Confederation of Kibbutzim Problem 

               Best Allocation (Acres) 
               Kibbutz 
Crop           1          2      3 
Sugar beets    133 1/3    100    25 
Cotton         100        250    150 
Sorghum        0          0      0 





Controlling Air Pollution 


The NORI & LEETS CO., one of the major producers of steel in its part of the world, 
is located in the city of Steeltown and is the only large employer there. Steeltown has 
grown and prospered along with the company, which now employs nearly 50,000 
residents. Therefore, the attitude of the townspeople always has been “What’s good 
for Nori & Leets is good for the town.” However, this attitude is now changing; 
uncontrolled air pollution from the company’s furnaces is ruining the appearance of 
the city and endangering the health of its residents. 

A recent stockholders’ revolt resulted in the election of a new enlightened board 
of directors for the company. These directors are determined to follow socially re- 
sponsible policies, and they have been discussing with Steeltown city officials and 
citizens’ groups what to do about the air pollution problem. Together they have worked 
out stringent air quality standards for the Steeltown airshed. 

The three main types of pollutants in this airshed are particulate matter, sulfur 
oxides, and hydrocarbons. The new standards require that the company reduce its 
annual emission of these pollutants by the amounts shown in Table 3.12. The board 
of directors has instructed management to have the engineering staff determine how 
to achieve these reductions in the most economical way. 

The steelworks has two primary sources of pollution, namely, the blast furnaces 
for making pig iron and the open-hearth furnaces for changing iron into steel. In both 
cases the engineers have decided that the most effective types of abatement methods 
are (1) increasing the height of the smokestacks,¹ (2) using filter devices (including 
gas traps) in the smokestacks, and (3) including cleaner, high-grade materials among 


Table 3.12 Clean Air Standards for 
Nori & Leets Co. 

                 Required Reduction in 
                 Annual Emission Rate 
Pollutant        (Million Pounds) 
Particulates     60 
Sulfur oxides    150 
Hydrocarbons     125 





1 Subsequent to this study, this particular abatement method has become a controversial one. Because its 
effect is to reduce ground-level pollution by spreading emissions over a greater distance, environmental 
groups contend that this creates more acid rain by keeping sulfur oxides in the air longer. Consequently, 
the U.S. Environmental Protection Agency adopted new rules in 1985 to remove incentives for using tall 
smokestacks. 

Table 3.13 Reduction in Emission Rate from Maximum Feasible Use of Abatement Method 
for Nori & Leets Co. 

                 Taller Smokestacks         Filters                    Better Fuels 
                 Blast       Open-hearth    Blast       Open-hearth    Blast       Open-hearth 
Pollutant        Furnaces    Furnaces       Furnaces    Furnaces       Furnaces    Furnaces 
Particulates     12          9              25          20             17          13 
Sulfur oxides    35          42             18          31             56          49 
Hydrocarbons     37          53             28          24             29          20 








the fuels for the furnaces. All these methods have technological limits on how much 
emission they can eliminate from each type of furnace, as shown (in millions of pounds 
per year) in Table 3.13. 

However, the methods can be used at any fraction (including zero) of their 
abatement capacities shown in this table, and the fractions can be different for blast 
furnaces and open-hearth furnaces. For either type of furnace, the emission reduction 
achieved by each method is not substantially affected by whether or not the other 
methods also are used. 

After these data were developed, it became clear that no single method by itself 
could achieve all the required reductions. On the other hand, combining all three 
methods at full capacity on both types of furnaces (which would be prohibitively 
expensive if the company’s products are to remain competitively priced) is much more 
than adequate. Therefore, the engineers concluded that they would have to use some 
combination of the methods, perhaps with fractional capacities, based upon their 
relative costs. Furthermore, because of the differences between the blast and the open- 
hearth furnaces, the two types probably should not use the same combination. 
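
The two observations in this paragraph can be checked directly from Tables 3.12 and 3.13. The sketch below is illustrative only; the figures for filters on blast furnaces are read from the emission-reduction constraints of the formulation, and the names used are assumptions.

```python
# Required annual reductions (Table 3.12), in millions of pounds.
REQUIRED = {"particulates": 60, "sulfur oxides": 150, "hydrocarbons": 125}

# Maximum reductions (Table 3.13), stored as (blast, open-hearth) pairs.
MAX_REDUCTION = {
    "taller smokestacks": {
        "particulates": (12, 9),
        "sulfur oxides": (35, 42),
        "hydrocarbons": (37, 53),
    },
    "filters": {
        "particulates": (25, 20),
        "sulfur oxides": (18, 31),
        "hydrocarbons": (28, 24),
    },
    "better fuels": {
        "particulates": (17, 13),
        "sulfur oxides": (56, 49),
        "hydrocarbons": (29, 20),
    },
}


def meets_all(reduction):
    """True if every pollutant reaches its required reduction."""
    return all(reduction[p] >= REQUIRED[p] for p in REQUIRED)


# Each method used alone at full capacity on both furnace types.
single_method = {
    method: {p: sum(pair) for p, pair in table.items()}
    for method, table in MAX_REDUCTION.items()
}
methods_sufficient_alone = [m for m in MAX_REDUCTION if meets_all(single_method[m])]

# All three methods at full capacity on both furnace types.
all_methods = {
    p: sum(single_method[m][p] for m in MAX_REDUCTION) for p in REQUIRED
}
```

No single method reaches even the particulate requirement of 60 (the totals are 21, 45, and 30), while all three together give 96, 231, and 191 million pounds, comfortably above the required 60, 150, and 125.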

An analysis was conducted to estimate the total annual cost that would be 
incurred by each abatement method. In addition to increased operating and mainte- 
nance expenses, consideration was given also to the initial costs (converted to an 
equivalent annual basis) of the method as well as any resulting loss in efficiency of 
the production process. This analysis led to the total cost estimates (in millions of 
dollars) given in Table 3.14 for using the methods at their full abatement capacities. 
It also was determined that the cost of a method being used at a lower level is 
essentially proportional to its fractional capacity. Thus, for any given fraction 
used, the total annual cost would be that fraction of the corresponding quantity in 
Table 3.14. 

Table 3.14 Total Annual Cost from Maximum 
Feasible Use of Abatement Method for 
Nori & Leets Co. 

                      Blast       Open-hearth 
Abatement Method      Furnaces    Furnaces 
Taller smokestacks    8           10 
Filters               7           6 
Better fuels          11          9 


Table 3.15 Decision Variables (Fraction of 
Maximum Feasible Use of Abatement Method) 
for Nori & Leets Co. 

                      Blast       Open-hearth 
Abatement Method      Furnaces    Furnaces 
Taller smokestacks    x_1         x_2 
Filters               x_3         x_4 
Better fuels          x_5         x_6 


The stage now was set to develop the general framework of the company’s plan 
for pollution abatement. This plan would consist of specifying which types of 
abatement methods would be used and at what fractions of their abatement capacities for 
(1) the blast furnaces and (2) the open-hearth furnaces. Because of the combinatorial 
nature of the problem of finding a plan that satisfies the requirements with the smallest 
possible cost, an operations research team was formed to solve the problem. The 
team adopted a linear programming approach, formulating the model summarized 
next. 


FORMULATION AS A LINEAR PROGRAMMING PROBLEM: This problem has six 
decision variables, x_j (j = 1, 2, . . . , 6), each representing the usage of one of the 
three abatement methods for one of the two types of furnaces, expressed as a fraction 
of the abatement capacity. The ordering of these variables is shown in Table 3.15. 
Because the objective is to minimize total cost while satisfying the emission reduction 
requirements, the model is 


Minimize Z = 8x_1 + 10x_2 + 7x_3 + 6x_4 + 11x_5 + 9x_6, 
subject to the following constraints: 
1. Emission reduction: 
12x_1 + 9x_2 + 25x_3 + 20x_4 + 17x_5 + 13x_6 ≥ 60 
35x_1 + 42x_2 + 18x_3 + 31x_4 + 56x_5 + 49x_6 ≥ 150 
37x_1 + 53x_2 + 28x_3 + 24x_4 + 29x_5 + 20x_6 ≥ 125 
2. Technological limit: 
x_j ≤ 1, for j = 1, 2, ..., 6 
3. Nonnegativity: 
x_j ≥ 0, for j = 1, 2, ..., 6. 


The operations research team used this model¹ to find the minimum-cost plan 
(x_1, x_2, x_3, x_4, x_5, x_6) = (1, 0.623, 0.343, 1, 0.048, 1). Sensitivity analysis then was 
conducted, followed by detailed planning and managerial review. Soon after, this 
program for controlling air pollution was fully implemented by the company, and the 
citizens of Steeltown breathed deep sighs of relief. 
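
The reported plan can be checked against the model with a few lines of arithmetic. Because the plan is quoted to three decimal places, the sketch below allows a small tolerance when testing the emission constraints; that tolerance and the names used are assumptions made for this illustration.

```python
# Objective coefficients: total annual cost at full use, in $ millions.
COST = [8, 10, 7, 6, 11, 9]

# Emission-reduction coefficients, ordered x_1, ..., x_6.
REDUCTION = {
    "particulates": [12, 9, 25, 20, 17, 13],
    "sulfur oxides": [35, 42, 18, 31, 56, 49],
    "hydrocarbons": [37, 53, 28, 24, 29, 20],
}
REQUIRED = {"particulates": 60, "sulfur oxides": 150, "hydrocarbons": 125}

plan = [1, 0.623, 0.343, 1, 0.048, 1]  # reported plan, rounded to 3 places
TOLERANCE = 0.05                       # assumed allowance for that rounding


def dot(coeffs, x):
    """Sum of coefficient times variable value."""
    return sum(c * v for c, v in zip(coeffs, x))


total_cost = dot(COST, plan)
achieved = {p: dot(row, plan) for p, row in REDUCTION.items()}
within_limits = all(0 <= v <= 1 for v in plan)
meets_standards = all(
    achieved[p] >= REQUIRED[p] - TOLERANCE for p in REQUIRED
)
```

With the rounded values, the plan costs about 32.16 million dollars per year and delivers reductions of roughly 60.0, 150.0, and 125.0 million pounds, so all three emission constraints hold essentially with equality.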


1 An equivalent formulation can express each decision variable in natural units for its abatement method; 
for example, x_1 and x_2 could represent the number of feet that the heights of the smokestacks are increased. 


Other Examples 


The four linear programming examples you have seen so far are but a small sampling 
of the uses of this technique. Many more illustrations are given in Chaps. 7 and 8; 
most involve business and industrial applications, but several others arise in different 
contexts. Chapter 7 focuses on certain special types of linear programming problems 
that provide many important applications. Chapter 8 considers some examples that 
are more difficult to formulate, and it also includes a case study involving the design 
of school attendance zones to achieve better racial balance. But before considering 
these topics, we next discuss how to solve linear programming problems. 


3.5 Conclusions 


Linear programming is a powerful technique for dealing with the problem of allocating 
limited resources among competing activities as well as other problems having a 
similar mathematical formulation. It has become a standard tool of great importance 
for numerous business and industrial organizations. Furthermore, almost any social 
organization is concerned with allocating resources in some context, and there is a 
growing recognition of the extremely wide applicability of this technique. 

However, not all problems of allocating limited resources can be formulated to 
fit a linear programming model, even as a reasonable approximation. When one or 
more of the assumptions of linear programming is violated seriously, it may then be 
possible to apply another mathematical programming model instead, e.g., the models 
of integer programming (Chap. 13) or nonlinear programming (Chap. 14). 


PROBLEMS¹ 


1.* Suppose you have just inherited $6,000 and you want to invest it. Upon hearing 
this news, two different friends have offered you an opportunity to become a partner in two 
different entrepreneurial ventures, one planned by each friend. In both cases, this investment 
would involve expending some of your time next summer as well as putting up cash. Becoming 
a full partner in the first friend’s venture would require an investment of $5,000 and 400 hours, 
and your estimated profit (ignoring the value of your time) would be $4,500. The corresponding 
figures for the second friend’s venture are $4,000 and 500 hours, with an estimated profit to 
you of $4,500. However, both friends are flexible and would allow you to come in at any 
fraction of a full partnership you would like; your share of the profit would be proportional to 
this fraction. 


1 Some additional formulation problems are given at the end of Chap. 8. Also note that at least partial 
answers to starred problems are given at the back of the book. 

Because you were looking for an interesting summer job anyway (maximum of 600 
hours), you have decided to participate in one or both friends’ ventures in whichever combi- 
nation would maximize your total estimated profit. You now need to solve the problem of 
finding the best combination. 

(a) Describe the analogy between this problem and the Wyndor Glass Co. problem 
discussed in Sec. 3.1. Then construct and fill in a table like Table 3.2 for this 
problem, identifying both the activities and the resources. 

(b) Formulate the linear programming model for this problem. 

(c) Solve this model graphically. What is your total estimated profit? 

(d) Indicate why each of the four assumptions of linear programming (Sec. 3.3) appears 
to be reasonably satisfied for this problem. Is one assumption more doubtful than 
the others? If so, what should be done to take this into account? 


2. A manufacturing firm has discontinued the production of a certain unprofitable prod- 
uct line. This act created considerable excess production capacity. Management is considering 
devoting this excess capacity to one or more of three products; call them products 1, 2, and 3. 
The available capacity on the machines that might limit output is summarized in the following 
table: 


Available Time 


Machine Type (in Machine Hours Per Week) 
Milling machine 500 
Lathe 350 
Grinder 150 





The number of machine hours required for each unit of the respective products is 


Productivity Coefficient (in Machine Hours Per Unit) 


















Machine Type Product 1 Product 2 | Product 3 
Milling machine 5 
Lathe 0 


Grinder 


The Sales Department indicates that the sales potential for products 1 and 2 exceeds the 
maximum production rate and that the sales potential for product 3 is 20 units per week. The 
unit profit would be $50, $20, and $25, respectively, on products 1, 2, and 3. The objective 
is to determine how much of each product the firm should produce to maximize profit. 

Formulate the linear programming model for this problem. 


3.* Use the graphical procedure illustrated in Sec. 3.1 to solve the problem: 
Maximize Z = 2x_1 + x_2, 
subject to x_2 ≤ 10 
2x_1 + 5x_2 ≤ 60 
x_1 + x_2 ≤ 18 
3x_1 + x_2 ≤ 44 
and x_1 ≥ 0, x_2 ≥ 0. 


4. Use the graphical procedure illustrated in Sec. 3.1 to solve the problem: 
Maximize Z = 10x_1 + 20x_2, 
subject to -x_1 + 2x_2 ≤ 15 
x_1 + x_2 ≤ 12 
5x_1 + 3x_2 ≤ 45 
and x_1 ≥ 0, x_2 ≥ 0. 


5. For each of the four assumptions of linear programming discussed in Sec. 3.3, write 
a one-paragraph analysis of how well you feel it applies to each of the following examples 
given in Sec. 3.4: 

(a) Design of radiation therapy (Mary). 

(b) Regional planning (Southern Confederation of Kibbutzim). 

(c) Controlling air pollution (Nori & Leets Co.). 


6. Consider a problem with two decision variables, x_1 and x_2, which represent the levels 
of activities 1 and 2, respectively. For each variable, the permissible values are 0, 1, and 2, 
where the feasible combinations of these values for the two variables are determined from a 
variety of constraints. The objective is to maximize a certain measure of performance denoted 
by Z. The values of Z for the possibly feasible values of (x_1, x_2) are estimated to be those given 
in the following table: 

          x_2 
x_1     0     1     2 
0       0     4     8 
1       3     8     13 
2       6     12    18 





Based on this information, indicate whether this problem completely satisfies each of the 
four assumptions of linear programming. Justify your answers. 


7. Consider the problem described at the beginning of Sec. 8.5, where the city of 
Middletown is using linear programming to redesign the school attendance zones for its high 
schools. The objective function used there is to minimize the total distance that students must 
travel, subject to constraints on the racial balance in the schools. Now suppose that the school 
board decides to change the objective function to minimizing the total cost of bussing the 
students. (However, the decision variables continue to be x_ij = number of students in tract i 
assigned to school j.) Each student assigned to a school more than a mile away will be given 
the opportunity to ride a bus. (However, some of these students may choose to get to and from 
school in some other way.) Each bus can carry 40 students. The daily cost of providing each 
bus is estimated to be $50 plus $1 for each student carried. A bus may transport students from 
more than one tract to try to fill the buses. 


For each of the four assumptions of linear programming discussed in Sec. 3.3, write a 
one-paragraph analysis of how well it applies to the revised objective function. 


8. Use the graphical procedure illustrated in Sec. 3.1 to solve the problem: 
Minimize Z = 15x_1 + 20x_2, 
subject to x_1 + 2x_2 ≥ 10 
2x_1 - 3x_2 ≤ 6 
x_1 + x_2 ≥ 6 
and x_1 ≥ 0, x_2 ≥ 0. 


9. Use the graphical procedure illustrated in Sec. 3.1 and Fig. 3.5 to solve the problem: 
Minimize Z = 3x_1 + 2x_2, 
subject to x_1 + 2x_2 ≤ 12 
2x_1 + 3x_2 = 12 
2x_1 + x_2 ≥ 8 
and x_1 ≥ 0, x_2 ≥ 0. 


10. A farmer is raising pigs for market, and he wishes to determine the quantities of 
the available types of feed that should be given to each pig to meet certain nutritional require- 
ments at a minimum cost. The number of units of each type of basic nutritional ingredient 
contained within a kilogram of each feed type is given in the following table, along with the 
daily nutritional requirements and feed costs: 

















Kilogram | Kilogram | Kilogram Minimum 
Nutritional Daily 
Ingredient Requirement 
Carbohydrates 200 
Protein 180 
Vitamins 150 





Cost (¢) 


Formulate the linear programming model for this problem. 


11. A certain corporation has three branch plants with excess production capacity. All 
three plants have the capability for producing a certain new product, and management has 
decided to use some of the excess capacity in this way. This product can be made in three 
sizes—large, medium, and small—that yield a net unit profit of $420, $360, and $300, re- 
spectively. Plants 1, 2, and 3 have the excess capacity to produce 750, 900, and 450 units per 
day of this product, respectively, regardless of the size or combination of sizes involved. 

The amount of available in-process storage space also imposes a limitation on the pro- 
duction rates of the new product. Plants 1, 2, and 3 have 13,000, 12,000, and 5,000 square 
feet of in-process storage space available for a day’s production of this product. Each unit of 
the large, medium, and small sizes produced per day requires 20, 15, and 12 square feet, 
respectively. 

Sales forecasts indicate that 900, 1,200, and 750 units of the large, medium, and small 
sizes, respectively, can be sold per day. 

To maintain a uniform workload among the plants and to retain some flexibility, man- 
agement has decided that the plants must use the same percentage of their excess capacity to 
produce the new product. 

Management wishes to know how much of each of the sizes should be produced by each 
of the plants to maximize profit. 

Formulate the linear programming model for this problem. 


12. A farm family owns 125 acres of land and has $40,000 in funds available for 
investment. Its members can produce a total of 3,500 person-hours worth of labor during the 
winter months (mid-September to mid-May) and 4,000 person-hours during the summer. If any 
of these person-hours are not needed, younger members of the family will use them to work 
on a neighboring farm for $5/hour during the winter months and $6/hour during the summer. 

Cash income may be obtained from three crops and two types of livestock: dairy cows 
and laying hens. No investment funds are needed for the crops. However, each cow will require 
an investment outlay of $1,200, and each hen will cost $9. 

Each cow will require 1.5 acres of land, 100 person-hours of work during the winter 
months, and another 50 person-hours during the summer. Each cow will produce a net annual 
cash income of $1,000 for the family. The corresponding figures for each hen are: no acreage, 
0.6 person-hour during the winter, 0.3 more person-hour during the summer, and an annual 
net cash income of $5. The chicken house can accommodate a maximum of 3,000 hens, and 
the size of the barn limits the herd to a maximum of 32 cows. 

Estimated person-hours and income per acre planted in each of the three crops are 





20 35 
50 75 
600 900 


The family wishes to determine how much acreage should be planted in each of the crops 
and how many cows and hens should be kept to maximize its net cash income. Formulate the 
linear programming model for this problem. 






Winter person-hours 
Summer person-hours 
Net annual cash income ($) 







13. A cargo plane has three compartments for storing cargo: front, center, and back. 
These compartments have capacity limits on both weight and space, as summarized below: 














Weight 
Capacity 
(Tons) 


Space 
Capacity 
(Cubic Feet) 


Furthermore, the weight of the cargo in the respective compartments must be the same pro- 
portion of that compartment’s weight capacity to maintain the balance of the airplane. 


The following four cargoes have been offered for shipment on an upcoming flight as 
space is available: 

         Volume               Profit 
Cargo    (Cubic Feet/Ton)     ($/Ton) 
1        500                  320 
2        700                  400 
3        600                  360 
4        400                  290 


Any portion of these cargoes can be accepted. The objective is to determine how much (if any) 
of each cargo should be accepted and how to distribute each among the compartments to 
maximize the total profit for the flight. 


Formulate the linear programming model for this problem. 


14. An investor has money-making activities A and B available at the beginning of each 
of the next five years (call them years 1 to 5). Each dollar invested in A at the beginning of a 
year returns $1.40 (a profit of $0.40) two years later (in time for immediate reinvestment). 
Each dollar invested in B at the beginning of a year returns $1.70 three years later. 
In addition, money-making activities C and D will each be available at one time in the 
future. Each dollar invested in C at the beginning of year 2 returns $1.90 at the end of year 5. 
Each dollar invested in D at the beginning of year 5 returns $1.30 at the end of year 5. 
The investor begins with $60,000 and wishes to know which investment plan maximizes 
the amount of money that can be accumulated by the beginning of year 6. 
Formulate the linear programming model for this problem. 


Solving Linear 
Programming Problems: 
The Simplex Method 


We now are ready to begin studying the simplex method, the general procedure for 
solving linear programming problems. Developed by George Dantzig in 1947, it has 
proven to be a remarkably efficient method that is used routinely to solve huge prob- 
lems on today’s computers. Except for tiny problems, this method is always executed 
on a computer, and sophisticated software packages are widely available. Neverthe- 
less, it is important to learn something about how the method works in order to 
understand how to perform post-optimality analysis (including sensitivity analysis) on 
the model. Therefore, this chapter describes and illustrates the main features of the 
simplex method. 

The first section introduces the general nature of the simplex method, including 
its geometric interpretation. The following three sections then develop the procedure 
for solving any linear programming model that is in our standard form (as defined in 
Sec. 3.2) and has only positive right-hand sides (b_i) in the functional constraints. 
Certain details on resolving ties are deferred to Sec. 4.5, and Sec. 4.6 describes how 
to adapt this method to other model forms. We next discuss post-optimality analysis 

4.1 The Essence of the Simplex Method





The simplex method actually is an algorithm, the first of many you will see in this 
book. Although you may not have heard this name used, you undoubtedly have 
encountered many algorithms before. For example, the familiar procedure for long 
division is an algorithm. So is the procedure for calculating square roots. In fact, any 
iterative solution procedure is an algorithm. Thus an algorithm is simply a process 
where a systematic procedure is repeated (iterated) over and over again until the 
desired result is obtained. Each time through the systematic procedure is called an 
iteration. (Can you see what the iteration is for the long division algorithm?) Con- 
sequently, an algorithm replaces one difficult problem by a series of easy ones. 

In addition to iterations, algorithms also include a procedure for getting started 
and a criterion for determining when to stop, as summarized here: 


Structure of Algorithms¹

Initialization step: Set up to start iterations.
Iterative step: Perform an iteration.
Stopping rule: Has the desired result been obtained?
    If no, return to the iterative step.
    If yes, stop.


For most operations research algorithms, including the simplex method, the 
desired result mentioned in the stopping rule is that the current solution is optimal. 
In this case, the stopping rule actually is an optimality test, as shown here: 


Structure of Most Operations Research Algorithms

Initialization step: Set up to start iterations.
Iterative step: Perform an iteration.
Optimality test: Is the current solution optimal?
    If no, return to the iterative step.
    If yes, stop.
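
The three-part structure above can be sketched in a few lines of code. The example below is only an illustration: it uses Heron's averaging update as the square-root procedure, and that choice, the function name sqrt_by_iteration, and the tolerance are assumptions made here rather than anything specified in the text. Note that the stopping rule is checked before the first iteration as well.

```python
def sqrt_by_iteration(a, tol=1e-12, max_iter=100):
    """Approximate sqrt(a) for a > 0 with an init / iterate / stop structure."""
    # Initialization step: set up to start iterations (hypothetical starting guess).
    x = a if a >= 1 else 1.0
    iterations = 0
    # Stopping rule: has the desired result been obtained?
    while abs(x * x - a) > tol * max(a, 1.0) and iterations < max_iter:
        # Iterative step: perform one iteration (Heron's update, an assumed choice).
        x = 0.5 * (x + a / x)
        iterations += 1
    return x, iterations


root, n_iter = sqrt_by_iteration(2.0)
print(root, n_iter)
```

Each pass through the loop body is one iteration, and the loop condition plays the role of the stopping rule.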


The simplex method is an algebraic procedure, where each iteration involves 
solving a system of equations to obtain a new trial solution for the optimality test. 
However, it also has a very useful geometric interpretation. To illustrate the general
geometric concepts, we shall use the graphical solution to the Wyndor Glass Co. 
example presented in Sec. 3.1. 

To refresh your memory, the graph for this example is repeated in Fig. 4.1. 
The five constraint lines and their points of intersection are highlighted in this figure 
because they are the keys to the analysis. In particular, these points of intersection 
are the corner-point solutions of the problem. The five that lie on the corners of the
feasible region, namely (0, 0), (0, 6), (2, 6), (4, 3), and (4, 0), are the corner-point feasible
solutions. [The other three, (0, 9), (4, 6), and (6, 0), are called corner-point infeasible
solutions.] Some of these corner-point feasible solutions are adjacent to each other
in the sense that they are connected by a single edge (line segment) on the boundary
of the feasible region; e.g., both (0, 6) and (4, 3) are adjacent to (2, 6).

1 Actually, the stopping rule usually is applied after the initialization step as well to see if any iterations
are needed.

Figure 4.1
Constraint lines and corner-point solutions for the Wyndor Glass Co. problem.
[Graph not reproduced. Its labeled points include (0, 9), (0, 6), (0, 0), (4, 0), and (6, 0).]
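
The corner-point solutions just listed can be reproduced by intersecting the constraint boundary lines two at a time. The following is a minimal sketch under stated assumptions: the constraint data are those of the Wyndor Glass Co. model as restated in Sec. 4.2, and the names LINES, intersection, and is_feasible are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# Each boundary line is written as a1*x1 + a2*x2 = c.
LINES = [
    (1, 0, 4),    # x1 = 4
    (0, 2, 12),   # 2*x2 = 12
    (3, 2, 18),   # 3*x1 + 2*x2 = 18
    (1, 0, 0),    # x1 = 0
    (0, 1, 0),    # x2 = 0
]


def intersection(line_a, line_b):
    """Return the point where two lines cross, or None if they are parallel."""
    a1, b1, c1 = line_a
    a2, b2, c2 = line_b
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    # Cramer's rule, kept exact with Fraction.
    return (Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det))


def is_feasible(point):
    """Check all functional and nonnegativity constraints."""
    x1, x2 = point
    return (x1 >= 0 and x2 >= 0 and x1 <= 4
            and 2 * x2 <= 12 and 3 * x1 + 2 * x2 <= 18)


corner_points = set()
for line_a, line_b in combinations(LINES, 2):
    point = intersection(line_a, line_b)
    if point is not None:
        corner_points.add(point)

feasible = sorted(p for p in corner_points if is_feasible(p))
infeasible = sorted(p for p in corner_points if not is_feasible(p))
print([tuple(int(v) for v in p) for p in feasible])
print([tuple(int(v) for v in p) for p in infeasible])
```

Of the ten pairs of lines, two pairs are parallel, which leaves the eight corner-point solutions named above.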

Section 5.1 develops in detail the general properties of corner-point feasible 
solutions for linear programming problems of any size, as well as the relationships 
between these properties and the algebra of the simplex method presented in the next 
two sections. The three key properties¹ that form the foundation of the simplex method
are summarized as follows. 


Properties of Corner-Point Feasible Solutions
1a. If there is exactly one optimal solution, then it must be a corner-point
    feasible solution.
1b. If there are multiple optimal solutions, then at least two must be adjacent
    corner-point feasible solutions.
2. There are only a finite number of corner-point feasible solutions.
3. If a corner-point feasible solution has no adjacent corner-point feasible
    solutions that are better (as measured by Z), then there are no better
    corner-point feasible solutions anywhere; i.e., it is optimal.


Property 1 implies that the search for an optimal solution can be reduced to considering 
only the corner-point feasible solutions, so there are only a finite number of solutions 
to consider (Property 2). Property 3 provides a very convenient optimality test. 

The simplex method exploits these three properties by examining only relatively
few of the promising corner-point feasible solutions and stopping as soon as one of
them passes this optimality test. In particular, it repeatedly (iteratively) moves from
the current corner-point feasible solution to a better adjacent corner-point feasible
solution (which can be done very efficiently), until the current solution does not have
any better adjacent corner-point feasible solutions. This procedure is summarized as
follows.

1 To ensure that the problem does, in fact, possess the optimal solution discussed in Properties 1 and 3, it
is sufficient to assume that (1) the problem has feasible solutions and (2) the problem has a bounded feasible
region. Assumption (2) also is needed to ensure that Property 1b holds.


Outline of the Simplex Method 

1. Initialization step: Start at a corner-point feasible solution. 

2. Iterative step: Move to a better adjacent corner-point feasible solution. (Repeat
this step as often as needed.)

3. Optimality test: The current corner-point feasible solution is optimal when 
none of its adjacent corner-point feasible solutions are better. 


This outline shows the essence of the simplex method, although the complete 
description in the next two sections does specify a convenient way of choosing the 
new solution in both the initialization and iterative steps. Using these choice rules, 
the simplex method proceeds as follows in the Wyndor Glass Co. example. 


1. Initialization step: Start at (0, 0). 
2a. Iteration 1: Move from (0, 0) to (0, 6). 
2b. Iteration 2: Move from (0, 6) to (2, 6). 
3. Optimality test: Neither (0, 6) nor (4, 3) is better than (2, 6), so stop. (2, 6) 
is optimal. 
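
The geometric outline can be traced in code for this small example. The sketch below makes two assumptions that are not part of the text: the five corner-point feasible solutions are hard-coded in the order they appear around the boundary of the feasible region in Fig. 4.1 (so neighbors in the list are adjacent), and the iterative step simply moves to the better of the two adjacent solutions. The simplex method's actual choice rule is developed in Sec. 4.3; for this example both rules give the same path. The names CPF, objective, adjacent, and simplex_walk are hypothetical.

```python
# Corner-point feasible solutions in boundary order; list neighbors
# (cyclically) are adjacent. This ordering is an assumption read off Fig. 4.1.
CPF = [(0, 0), (0, 6), (2, 6), (4, 3), (4, 0)]


def objective(point):
    """Z = 3*x1 + 5*x2 for the Wyndor Glass Co. example."""
    x1, x2 = point
    return 3 * x1 + 5 * x2


def adjacent(point):
    """Return the two corner-point feasible solutions adjacent to `point`."""
    i = CPF.index(point)
    return [CPF[i - 1], CPF[(i + 1) % len(CPF)]]


def simplex_walk(start):
    """Move to a better adjacent solution until none is better."""
    path = [start]            # initialization step
    current = start
    while True:
        best = max(adjacent(current), key=objective)
        # Optimality test: no adjacent solution is better.
        if objective(best) <= objective(current):
            return path
        current = best        # iterative step
        path.append(current)


path = simplex_walk((0, 0))
print(path, objective(path[-1]))
```

The path visits only three of the five corner-point feasible solutions before the optimality test is passed.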


4.2 Setting Up the Simplex Method 


The preceding section stressed the geometric concepts that underlie the simplex 
method. However, this algorithm normally is run on a computer, which can follow 
only algebraic instructions. Therefore, it is necessary to translate the conceptually 
geometric procedure just described into a usable algebraic procedure. In this section, 
we introduce the algebraic language of the simplex method and relate it to the concepts 
of the preceding section. 

In an algebraic procedure, it is much more convenient to deal with equations 
than with inequality relationships. Therefore, the first step in setting up the simplex 
method is to convert the functional inequality constraints into equivalent equality 
constraints. (The nonnegativity constraints can be left as inequalities because they are 
used only indirectly by the algorithm.) This conversion is done by introducing slack 
variables. To illustrate, consider the first functional constraint in the Wyndor Glass 
Co. example of Sec. 3.1, 

     x_1 ≤ 4.

The slack variable for this constraint is

     x_3 = 4 - x_1,

which is just the slack between the two sides of the inequality. Thus

     x_1 + x_3 = 4.

The original constraint x_1 ≤ 4 holds whenever x_3 ≥ 0. Hence x_1 ≤ 4 is entirely
equivalent to the set of constraints

     x_1 + x_3 = 4
and
     x_3 ≥ 0,

so these more convenient constraints are used instead.

By introducing slack variables in an identical fashion for the other functional 
constraints, the original linear programming model for the example can now be re- 
placed by the equivalent model: 


Maximize Z = 3x_1 + 5x_2,
subject to
(1)    x_1          + x_3              = 4
(2)          2x_2         + x_4        = 12
(3)   3x_1 + 2x_2               + x_5  = 18
and x_j ≥ 0, for j = 1, 2, ..., 5.


Although this problem is identical to the original, this form is much more convenient 
for algebraic manipulation and for identification of corner-point feasible solutions. 
We call this the augmented form of the problem, because the original form has been 
augmented by some additional variables needed to apply the simplex method. 

The terminology used in the preceding section (corner-point solutions, etc.) 
applies to the original form of the problem. We now introduce the corresponding 
terminology for the augmented form. 


An augmented solution is a solution for the original variables that has been augmented 
by the corresponding values of the slack variables. 


For example, augmenting the solution (3, 2) in the example yields the augmented
solution (3, 2, 1, 8, 5) because the corresponding values of the slack variables are
x_3 = 1, x_4 = 8, x_5 = 5.


A basic solution is an augmented corner-point solution.¹


To illustrate, consider the corner-point infeasible solution (4, 6) in the example.
Augmenting it with the resulting values of the slack variables x_3 = 0, x_4 = 0, and
x_5 = -6 yields the corresponding basic solution (4, 6, 0, 0, -6).

The fact that corner-point solutions (and so basic solutions) can be either feasible 
or infeasible implies the following definition: 


A basic feasible solution is an augmented corner-point feasible solution. 


Thus the corner-point feasible solution (0, 6) in the example is equivalent to the basic 
feasible solution (0, 6, 4, 0, 6) for the problem in augmented form. 
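
Augmenting a solution is a direct computation from the three functional constraints. The short sketch below reproduces the three augmented solutions quoted above; the function names augment and is_feasible are hypothetical and serve only this illustration.

```python
def augment(x1, x2):
    """Append the slack values x3, x4, x5 for the Wyndor Glass Co. constraints."""
    x3 = 4 - x1                  # slack in x1 <= 4
    x4 = 12 - 2 * x2             # slack in 2*x2 <= 12
    x5 = 18 - 3 * x1 - 2 * x2    # slack in 3*x1 + 2*x2 <= 18
    return (x1, x2, x3, x4, x5)


def is_feasible(augmented):
    """An augmented solution is feasible when every variable is nonnegative."""
    return all(value >= 0 for value in augmented)


for point in [(3, 2), (4, 6), (0, 6)]:
    solution = augment(*point)
    label = "feasible" if is_feasible(solution) else "infeasible"
    print(point, "->", solution, label)
```

A negative slack value, such as x_5 = -6 for the point (4, 6), signals directly which constraint is violated.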

The only difference between basic solutions and corner-point solutions (or between
basic feasible solutions and corner-point feasible solutions) is whether or not
the values of the slack variables are included. For any basic solution, the corresponding
corner-point solution is obtained simply by deleting the slack variables. Therefore,
the geometric and algebraic relationships between these two solutions are very close,
as described in Sec. 5.1.

Because the terms basic solution and basic feasible solution are very important
parts of the standard vocabulary of linear programming, we now need to clarify their
algebraic properties. For the augmented form of the example, notice that the system
of functional constraints has two more variables (5) than equations (3). This fact gives
us two degrees of freedom in solving the system, since any two variables can be
chosen to be set equal to any arbitrary value in order to solve the three equations in
terms of the remaining three variables (barring redundancies). The simplex method
uses zero for this arbitrary value. The variables that are currently set to zero by the
simplex method are called nonbasic variables, and the others are called basic variables.
The resulting solution is called a basic solution. If all of the basic variables
are nonnegative, the solution is called a basic feasible solution.

1 When the original problem includes equality constraints, the basic solutions are just the augmented
corner-point solutions that satisfy all these constraints.

To illustrate these definitions, consider again the basic feasible solution
(0, 6, 4, 0, 6). This solution was obtained before by augmenting the corner-point
feasible solution (0, 6). However, another way to obtain this same solution is to choose
x_1 and x_4 to be the two nonbasic variables, and so the two variables to be set equal
to zero. The three equations then yield, respectively, x_3 = 4, x_2 = 6, and x_5 = 6 as
the solution for the three basic variables. Because all three of these basic variables
are nonnegative, this basic solution (0, 6, 4, 0, 6) is indeed a basic feasible solution.
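
The same construction can be carried out for every choice of two nonbasic variables. The following is a minimal sketch, assuming the augmented-form equations of the example; the helpers solve and basic_solution are hypothetical names, and the small exact-arithmetic solver is included only to keep the example self-contained. Choices for which the remaining three equations have no unique solution (for instance, x_1 and x_3 both nonbasic, which would require 0 = 4) are skipped.

```python
from fractions import Fraction
from itertools import combinations

# Augmented form: columns x1..x5, one row per functional constraint.
A = [[1, 0, 1, 0, 0],
     [0, 2, 0, 1, 0],
     [3, 2, 0, 0, 1]]
b = [4, 12, 18]


def solve(matrix, rhs):
    """Solve a square system exactly; return None if it is singular."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(r)]
            for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


def basic_solution(nonbasic):
    """Set the nonbasic variables (0-based indices) to zero and solve for the rest."""
    basic = [j for j in range(5) if j not in nonbasic]
    values = solve([[row[j] for j in basic] for row in A], b)
    if values is None:
        return None
    x = [Fraction(0)] * 5
    for j, value in zip(basic, values):
        x[j] = value
    return tuple(x)


solutions = {}
for nonbasic in combinations(range(5), 2):
    x = basic_solution(nonbasic)
    if x is not None:
        solutions[nonbasic] = x

feasible = [x for x in solutions.values() if min(x) >= 0]
print(len(solutions), len(feasible))
```

Indices are 0-based here, so the pair (0, 3) stands for the nonbasic variables x_1 and x_4 used in the illustration above.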

Two basic feasible solutions are adjacent if all but one of their nonbasic vari- 
ables are the same (so the same statement holds for their basic variables). Conse- 
quently, moving from the current basic feasible solution to an adjacent one involves 
switching one variable from nonbasic to basic and vice versa for one other variable. 

To illustrate adjacent basic feasible solutions, consider one pair of adjacent
corner-point feasible solutions in Fig. 4.1, (0, 0) and (0, 6). Their augmented solutions,
(0, 0, 4, 12, 18) and (0, 6, 4, 0, 6), automatically are adjacent basic feasible
solutions. However, you don’t need to look at Fig. 4.1 to draw this conclusion.
Another signpost is that their nonbasic variables, (x_1, x_2) and (x_1, x_4), are the same
with just the one exception that x_2 has been replaced by x_4. Consequently, moving
from (0, 0, 4, 12, 18) to (0, 6, 4, 0, 6) involves switching x_2 from nonbasic to basic
and vice versa for x_4.

In general terms, the number of nonbasic variables in a basic solution always 
equals the number of degrees of freedom in the system of equations, and the number 
of basic variables always equals the number of functional constraints. 
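
The adjacency criterion needs only the two sets of nonbasic variables, as the following sketch suggests. The function names are_adjacent and exchange are hypothetical, and the check assumes that both inputs really are the nonbasic sets of basic feasible solutions of the same problem.

```python
def are_adjacent(nonbasic_a, nonbasic_b):
    """True when all but one of the nonbasic variables are the same."""
    a, b = set(nonbasic_a), set(nonbasic_b)
    return len(a) == len(b) and len(a & b) == len(a) - 1


def exchange(nonbasic_from, nonbasic_to):
    """Return (entering, leaving) for a move between adjacent basic solutions."""
    a, b = set(nonbasic_from), set(nonbasic_to)
    if not are_adjacent(a, b):
        raise ValueError("the two basic solutions are not adjacent")
    (entering,) = a - b   # nonbasic before the move, basic after it
    (leaving,) = b - a    # basic before the move, nonbasic after it
    return entering, leaving


print(are_adjacent({"x1", "x2"}, {"x1", "x4"}))
print(exchange({"x1", "x2"}, {"x1", "x4"}))
print(are_adjacent({"x1", "x2"}, {"x4", "x5"}))
```

The first pair corresponds to the move from (0, 0, 4, 12, 18) to (0, 6, 4, 0, 6); the last pair compares (0, 0, 4, 12, 18) with (2, 6, 2, 0, 0), which share no nonbasic variable.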

When dealing with the problem in augmented form, it is convenient to consider 
and manipulate the objective function equation at the same time as the new constraint 
equations. Therefore, before starting the simplex method, the problem needs to be 
rewritten once again in an equivalent way as: 


Maximize Z,
subject to
(0)   Z - 3x_1 - 5x_2                       = 0
(1)        x_1         + x_3                = 4
(2)              2x_2         + x_4         = 12
(3)       3x_1 + 2x_2               + x_5   = 18
and x_j ≥ 0, for j = 1, 2, ..., 5.


It is just as if Eq. (0) actually were one of the original constraints, but because it 
already is in equality form, no slack variable is needed. With this interpretation, the 
basic solutions would be unchanged, except that Z would be viewed as a permanent
additional basic variable. 

Somewhat fortuitously, the model for the Wyndor Glass Co. problem fits our
standard form, and all its functional constraints have positive right-hand sides (b_i).
If this had not been the case, then additional adjustments would have been needed at 
this point before applying the simplex method. These details are deferred to Sec. 4.6, 
and we now turn to focusing on the simplex method itself. 


4.3 The Algebra of the Simplex Method 


The discussion in Sec. 4.1 of the essence of the simplex method did not get into the 
details of how the steps are performed. In particular, the following questions have not 
yet been answered completely (the parenthetical phrases restate the questions in the 
algebraic terminology of Sec. 4.2). 


1. Initialization step: How is the initial corner-point feasible solution (basic
   feasible solution) selected?

2. Iterative step: When seeking to move to a better adjacent corner-point feasible
   solution (adjacent basic feasible solution),
   (a) How is the direction of movement selected? (Which nonbasic variable
       is selected to become basic?)
   (b) Where do we stop? (Which basic variable becomes nonbasic?)
   (c) How is the new solution identified?

3. Optimality test: How do we determine that the current corner-point feasible
   solution (basic feasible solution) has no adjacent corner-point feasible
   solutions (adjacent basic feasible solutions) that are better?


These are the questions addressed in this section. We continue to use the prototype 
example of Sec. 3.1, as rewritten at the end of Sec. 4.2, for illustrative purposes. 


Initialization Step


The simplex method can start at any corner-point feasible solution (basic feasible
solution), so it chooses a convenient one. Before considering slack variables, this
choice is the origin (all original variables equal to zero), or (x_1, x_2) = (0, 0) in the
example.¹ Consequently, after the introduction of slack variables, the original variables
are the nonbasic variables and the slack variables are the basic variables for the
initial basic feasible solution. This choice is illustrated here, where the basic variables
are x_3, x_4, and x_5.

(1)    x_1          + x_3              = 4
(2)          2x_2         + x_4        = 12
(3)   3x_1 + 2x_2               + x_5  = 18

Because the nonbasic variables are set equal to zero, the remaining solution is read
as if the nonbasic variables were not there, so x_3 = 4, x_4 = 12, and x_5 = 18, giving
the initial basic feasible solution (0, 0, 4, 12, 18).

1 Note that choosing the origin makes the left-hand side of all the original functional constraints equal to
zero. Therefore, under the current assumptions about the form of the model, including ≤ functional
constraints and positive right-hand sides, this corner-point solution automatically is feasible.

Notice that the reason this solution can be read immediately is that each equation 
has just one basic variable, which has a coefficient of 1, and this basic variable does 
not appear in any other equation. You will soon see that when the set of basic variables 
changes, the simplex method uses an algebraic procedure (Gaussian elimination) to 
convert the equations into this same convenient form for reading every subsequent 
basic feasible solution as well. This form is called proper form from Gaussian 
elimination. 


Iterative Step 


At each iteration, the simplex method moves from the current basic feasible solution 
to a better adjacent basic feasible solution. This movement involves converting one 
nonbasic variable into a basic variable (called the entering basic variable) and si- 
multaneously converting a basic variable into a nonbasic variable (called the leaving 
basic variable), and then identifying the new basic feasible solution. 


QUESTION 1: What is the criterion for selecting the entering basic variable? 


The candidates for the entering basic variable are the n current nonbasic variables. 
The one chosen would be changed from a nonbasic to a basic variable, so its value 
would be increased from zero to some positive number and the others would be kept 
at zero. Since the new basic feasible solution is required to be an improvement (larger 
Z) over the current one, the rate of change in Z from increasing the entering basic 
variable must be a positive one. Using the current Eq. (0) to express Z just in terms 
of the nonbasic variables, the coefficient of each one is the rate at which Z would 
change as that variable is increased. The one that has the largest positive coefficient,
and so would increase Z at the fastest rate, is chosen to be the entering basic variable.¹
To illustrate, the two candidates for the entering basic variable in the example
are the current nonbasic variables x_1 and x_2. Since the objective function already is
written only in terms of these nonbasic variables, it can be considered just as is:

     Z = 3x_1 + 5x_2.

Both variables have positive coefficients, so increasing either one would increase Z,
but at the different rates of 3 and 5 per unit increase in the variable. Since 3 < 5, the
choice for the entering basic variable is x_2. Therefore, x_2 will be increased from zero
while x_1 remains zero.
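
The selection criterion amounts to picking the largest positive coefficient. A minimal sketch follows; the function name entering_variable and the dictionary representation of the objective function are assumptions made for this illustration, and ties are not handled specially because tie breaking is deferred to Sec. 4.5.

```python
def entering_variable(coefficients):
    """Pick the nonbasic variable with the largest positive coefficient.

    `coefficients` maps each nonbasic variable to its coefficient when Z is
    written only in terms of the nonbasic variables. Returns None when no
    coefficient is positive, in which case no variable can improve Z.
    """
    name, value = max(coefficients.items(), key=lambda item: item[1])
    return name if value > 0 else None


print(entering_variable({"x1": 3, "x2": 5}))
```

For Z = 3x_1 + 5x_2 this selects x_2, in agreement with the choice made above.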


QUESTION 2: How is the leaving basic variable identified? 


Ignoring slack variables, increasing x_2 from zero while keeping x_1 zero means that
we are moving up the x_2 axis in Fig. 4.1. The adjacent corner-point feasible solution
(0, 6) is reached by stopping at the first new constraint line (2x_2 = 12). We must
stop there even though there is a corner-point solution at (0, 9), because going further
would give infeasible solutions that violate the 2x_2 ≤ 12 constraint.

1 Note that this criterion does not guarantee selecting the variable that would increase Z the most because
the constraints may not allow increasing this variable as much as some of the others. However, the extra
computations required to check this are not considered worthwhile.

For the problem in augmented form, feasible solutions must satisfy both the
system of functional constraint equations and the nonnegativity constraints on all the
variables (original variables and slack variables). Increasing x_2 from zero while keeping
x_1 zero (nonbasic) means that some or all of the current basic variables (x_3, x_4, x_5)
must change their values to keep the system of equations satisfied. Some of these
variables will decrease as x_2 increases. The adjacent basic feasible solution is reached
when the first of the basic variables (the leaving basic variable) reaches a value of
zero. We must stop there to avoid going infeasible. Thus, when we have chosen the
entering basic variable, the leaving basic variable is not a matter of choice. It must
be the current basic variable whose nonnegativity constraint imposes the smallest upper
bound on how much the entering basic variable can be increased, as illustrated next.

The possibilities for the leaving basic variable in the example are the current
basic variables x_3, x_4, and x_5. The calculations for identifying which one must be the
leaving basic variable are summarized in Table 4.1. Since x_1 is a nonbasic variable,
x_1 = 0 in the second column of Table 4.1. The third column then indicates that
(1) x_3 remains nonnegative (= 4) regardless of how much x_2 is increased; (2) x_4 =
0 when x_2 = 6 (whereas x_4 > 0 when x_2 < 6 and x_4 < 0 when x_2 > 6); and
(3) x_5 = 0 when x_2 = 9 (whereas x_5 > 0 when x_2 < 9 and x_5 < 0 when x_2 > 9).
Thus, the numbers calculated in the third column are the upper bound for x_2 before
the corresponding basic variable in the first column would become negative. Since x_4
(the slack variable for the 2x_2 ≤ 12 constraint) imposes the smallest upper bound on
x_2, the leaving basic variable is x_4, so x_4 = 0 (nonbasic) and x_2 = 6 (basic) in the
new basic feasible solution.
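
The upper-bound calculation can be expressed as a small routine. This is a sketch under stated assumptions: each equation is supplied in the form described above (one basic variable with coefficient 1), the triple layout (basic variable, coefficients of the other variables, right-hand side) and the name leaving_variable are hypothetical, and the first equation wins in a tie because tie breaking is deferred to Sec. 4.5.

```python
from fractions import Fraction


def leaving_variable(rows, entering):
    """Return (basic variable, upper bound) with the smallest upper bound.

    `rows` is a list of (basic_var, coefficients, rhs) triples, where
    `coefficients` maps variable names to their coefficients in that equation.
    Equations in which the entering variable has a coefficient <= 0 impose
    no limit and are skipped. Returns None if no equation imposes a limit.
    """
    best = None
    for basic_var, coefficients, rhs in rows:
        a = coefficients.get(entering, 0)
        if a > 0:
            bound = Fraction(rhs) / Fraction(a)
            if best is None or bound < best[1]:
                best = (basic_var, bound)
    return best


initial_rows = [
    ("x3", {"x1": 1}, 4),             # x1 + x3 = 4
    ("x4", {"x2": 2}, 12),            # 2*x2 + x4 = 12
    ("x5", {"x1": 3, "x2": 2}, 18),   # 3*x1 + 2*x2 + x5 = 18
]
print(leaving_variable(initial_rows, "x2"))
```

With x_2 entering, the bounds are no limit, 6, and 9 for the three equations, so x_4 leaves.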


QUESTION 3: How can the new basic feasible solution be identified most conven- 
iently? 


After identifying the entering and leaving basic variables (including the new value of 
the entering basic variable), all that needs to be done to identify the new basic feasible 
solution is to solve for the new values of the remaining basic variables. This solution 
could be obtained directly from Table 4.1. However, in order to get set up for the
next iteration, the simplex method converts the system of equations into the same 
convenient proper form from Gaussian elimination as we had in the initialization step 
(namely, each equation has just one basic variable, which has a coefficient of 1, and 
this basic variable does not appear in any other equation). This conversion can be 
done by performing the following two kinds of algebraic operations: 


Algebraic Operations for Solving a System of Linear Equations 
1. Multiplying (or dividing) an equation by a nonzero constant. 
2. Adding (or subtracting) a multiple of one equation to another equation. 


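Table 4.1 Calculations for Determining First Leaving Basic
Variable for Wyndor Glass Co. Problem

Basic
Variable   Equation                    Upper Bound for x_2
x_3        x_3 = 4 - x_1               No limit
x_4        x_4 = 12 - 2x_2             x_2 ≤ 12/2 = 6  (minimum)
x_5        x_5 = 18 - 3x_1 - 2x_2      x_2 ≤ 18/2 = 9
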
These operations are legitimate because they involve only (1) multiplying equals (both
sides of an equation) by the same constant and (2) adding equals to equals. Therefore, 
a solution will satisfy the system of equations after such operations if and only if it 
did so before. 

To illustrate, consider the original set of equations, where the new basic variables
are Z, x_3, x_2, and x_5 in Eqs. (0) to (3), respectively (with Z playing the role of the
basic variable in the objective function equation):

(0)   Z - 3x_1 - 5x_2                       = 0
(1)        x_1         + x_3                = 4
(2)              2x_2         + x_4         = 12
(3)       3x_1 + 2x_2               + x_5   = 18.


Thus, x_2 has replaced x_4 as the basic variable in Eq. (2). We now need to solve this
system of equations for Z, x_2, x_3, and x_5. Since x_2 has a coefficient of 2 in Eq. (2),
this equation would be divided by 2 to give its new basic variable a coefficient of 1.
(This is an example of algebraic operation 1.)

(2)   x_2 + (1/2)x_4 = 6


Next, x_2 must be eliminated from the other equations in which it appears. Using
algebraic operation 2, this elimination is done as follows:

New Eq. (0) = old Eq. (0) + [5 x new Eq. (2)],

so
         Z - 3x_1 - 5x_2               = 0
      + (           5x_2 + (5/2)x_4    = 30)
(0)      Z - 3x_1        + (5/2)x_4    = 30.

Now

New Eq. (3) = old Eq. (3) - [2 x new Eq. (2)],

so
         3x_1 + 2x_2        + x_5      = 18
      - (       2x_2 + x_4             = 12)
(3)      3x_1        - x_4  + x_5      = 6.


The resulting complete new set of equations is

(0)   Z - 3x_1              + (5/2)x_4          = 30
(1)        x_1       + x_3                      = 4
(2)             x_2         + (1/2)x_4          = 6
(3)       3x_1              -      x_4  + x_5   = 6.

For purposes of illustration, exchange the locations of x_2 and x_4.

(0)   Z - 3x_1 + (5/2)x_4                       = 30
(1)        x_1            + x_3                 = 4
(2)              (1/2)x_4        + x_2          = 6
(3)       3x_1 -      x_4               + x_5   = 6.


Now compare this last set of equations with the initial set obtained under the
initialization step, and notice that it is indeed in the same convenient proper form
from Gaussian elimination for immediately reading the current basic feasible solution
after noting that the nonbasic variables (x_1 and x_4) equal zero. Thus we now have our
new basic feasible solution, (x_1, x_2, x_3, x_4, x_5) = (0, 6, 4, 0, 6), which yields
Z = 30.
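
The two algebraic operations used in this iteration can be checked mechanically. The sketch below stores each equation as a row of exact fractions and applies operation 1 to the equation of the entering basic variable and operation 2 to all the others; the row layout and the name pivot are assumptions introduced for this illustration only.

```python
from fractions import Fraction

# Columns: Z, x1, x2, x3, x4, x5, right-hand side.
equations = [
    [1, -3, -5, 0, 0, 0, 0],    # Eq. (0)
    [0, 1, 0, 1, 0, 0, 4],      # Eq. (1)
    [0, 0, 2, 0, 1, 0, 12],     # Eq. (2)
    [0, 3, 2, 0, 0, 1, 18],     # Eq. (3)
]


def pivot(eqs, row, col):
    """Make column `col` a unit column with its 1 in equation `row`."""
    eqs = [[Fraction(v) for v in eq] for eq in eqs]
    # Operation 1: divide the pivot equation by the pivot coefficient.
    eqs[row] = [v / eqs[row][col] for v in eqs[row]]
    # Operation 2: subtract a multiple of the pivot equation from each other one.
    for r in range(len(eqs)):
        if r != row:
            factor = eqs[r][col]
            eqs[r] = [v - factor * p for v, p in zip(eqs[r], eqs[row])]
    return eqs


# x2 (column index 2) enters and becomes the basic variable of Eq. (2).
new_equations = pivot(equations, row=2, col=2)
for eq in new_equations:
    print([str(v) for v in eq])
```

Subtracting a negative multiple is the same as the addition written as "old Eq. (0) + [5 x new Eq. (2)]" above, so the rows printed should agree with the complete new set of equations.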

To place this algebraic procedure into broader perspective, we have just solved
the original set of equations to obtain the general solution for Z, x_2, x_3, and x_5 in
terms of x_1 and x_4. (This general solution can be expressed explicitly by moving x_1
and x_4 to the right-hand side of the new set of equations, but we won’t bother to do
this.) We then obtained a specific solution (the basic feasible solution) by setting x_1
and x_4 (the nonbasic variables) equal to zero. This procedure for obtaining the
simultaneous solution of a system of linear equations is called the Gauss-Jordan method
of elimination, or Gaussian elimination for short.¹ The key concept for this method
is using the two kinds of algebraic operations to reduce the original system of equations
to proper form from Gaussian elimination, where each basic variable has been
eliminated from all but one equation (its equation) and has a coefficient of +1 in that
equation. Once proper form from Gaussian elimination has been obtained, the solution
for the basic variables can be read directly from the right-hand side of the equations.

If Gaussian elimination is not yet clear, you can find a more detailed description 
in Appendix 4. 


Optimality Test 


To determine whether the current basic feasible solution is optimal, the current 
Eq. (0) is used to rewrite the objective function just in terms of the current nonbasic 
variables, 


     Z = 30 + 3x_1 - (5/2)x_4.


Increasing either of these nonbasic variables from zero (while adjusting the values of
the basic variables to continue satisfying the system of equations) would result in
moving toward one of the two adjacent basic feasible solutions. Because x_1 has a
positive coefficient, increasing x_1 would lead toward an adjacent basic feasible solution
that is better than the current basic feasible solution, so the current solution is not
optimal.

In general terms, the current basic feasible solution is optimal if and only if all
of the nonbasic variables have nonpositive coefficients (≤ 0) in the current form of
the objective function. This current form is obtained by bringing the x_j variables over
to the right-hand side of the current Eq. (0) after all of the equations have been
converted to proper form from Gaussian elimination [which eliminates basic variables
from Eq. (0)]. Equivalently, the variables can be left on the left-hand side, in which
case the optimality test is whether all of the nonbasic variables have nonnegative
coefficients (≥ 0) in the current Eq. (0).
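
The two equivalent statements of the optimality test differ only in the sign of the coefficients being checked, as this short sketch illustrates. The function names and the dictionary representation are hypothetical; the sample coefficients are those of the objective function obtained after the first iteration of the example.

```python
from fractions import Fraction


def optimal_from_objective(rhs_coefficients):
    """Test using Z = constant + sum(coefficient * nonbasic variable)."""
    return all(c <= 0 for c in rhs_coefficients.values())


def optimal_from_eq0(lhs_coefficients):
    """Test using the coefficients as they appear on the left of Eq. (0)."""
    return all(c >= 0 for c in lhs_coefficients.values())


# After iteration 1: Z = 30 + 3*x1 - (5/2)*x4, i.e. Z - 3*x1 + (5/2)*x4 = 30.
objective_form = {"x1": Fraction(3), "x4": Fraction(-5, 2)}
eq0_form = {name: -c for name, c in objective_form.items()}

print(optimal_from_objective(objective_form), optimal_from_eq0(eq0_form))
```

Both tests report that the solution (0, 6, 4, 0, 6) is not optimal, because x_1 still has a positive coefficient in the objective function.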


1 Actually, there are some technical differences between the Gauss-Jordan method of elimination and 
Gaussian elimination, but we will not make this distinction. 


The reason that the current form of the objective function is used for the opti- 
mality test instead of the original objective function is that the current form contains 
all of the nonbasic variables and none of the basic variables. All of the nonbasic 
variables are needed in order to be able to compare all of the adjacent basic feasible 
solutions with the current solution. The basic variables must not appear because their 
values may change when a nonbasic variable is increased from zero, in which case 
the coefficient of the nonbasic variable no longer indicates the rate of change of Z. 
Because of the constraint equations, the two forms of the objective function are 
equivalent, so the one that contains all the desired information is used. 

The reason that Eq. (0) originally was appended to the system of constraint 
equations and then included in the process of Gaussian elimination was just so this 
new, more convenient form of the objective function could be obtained. 

Before proceeding with the next iteration, it is now possible to give a meaningful 
summary of the simplex method. (Tie-breaking considerations are deferred to Sec. 
4.5.) 


Summary of the Simplex Method 


1. INITIALIZATION STEP: Introduce slack variables. If the model is not in the form 
being assumed in this section, see Sec. 4.6 for the necessary adjustments. Otherwise, 
select the original variables to be the nonbasic variables (and thus equal to zero) and 
the slack variables to be the basic variables (and thus equal to the right-hand side) in 
the initial basic feasible solution. Go to the optimality test. 


2. ITERATIVE STEP 

Part 1: Determine the entering basic variable: Select the nonbasic variable that, when 
increased, would increase Z at the fastest rate. Do this by using the current Eq. (0) 
to express Z just in terms of the nonbasic variables and then selecting the nonbasic 
variable with the largest positive coefficient.¹


Part 2: Determine the leaving basic variable: Select the basic variable that reaches
zero first as the entering basic variable is increased. Each basic variable appears in
just its equation, so this equation is used to determine when this basic variable reaches
zero as the entering basic variable is increased. A formal algebraic procedure for doing
this is to let e denote the subscript of the entering basic variable, let a'_ie denote its
current coefficient in Eq. (i), and let b'_i denote the current right-hand side for this
equation (i = 1, 2, ..., m). Then the upper bound for x_e in Eq. (i) is

     x_e ≤ +∞               if a'_ie ≤ 0,
     x_e ≤ b'_i / a'_ie     if a'_ie > 0,

where this equation’s basic variable reaches zero at this upper bound. Therefore,
determine the equation with the smallest such upper bound, and select the basic
variable in that equation as the leaving basic variable.


1 Equivalently, the current Eq. (0) can be used directly, in which case the nonbasic variable with the largest
negative coefficient would be selected. This is what is done in the tabular form of the simplex method
presented in Sec. 4.4.

Part 3: Determine the new basic feasible solution: Starting from the current set of
equations, solve for the basic variables and Z in terms of the nonbasic variables by 
Gaussian elimination (see Appendix 4). Set the nonbasic variables equal to zero; each 
basic variable (and Z) equals the new right-hand side of the one equation in which it 
appears (with a coefficient of 1). 


3. OPTIMALITY TEST: Determine whether this solution is optimal: Check if Z can
be increased by increasing any nonbasic variable. This determination can be made by
rewriting the objective function just in terms of the nonbasic variables by bringing
these variables to the right-hand side in the current Eq. (0) and then checking the sign
of the coefficient of each nonbasic variable. If all these coefficients are nonpositive,
then this solution is optimal, so stop.¹ Otherwise, go to the iterative step.

To illustrate, let us apply this summary to the next iteration for the example. 


ITERATION 2 FOR EXAMPLE 

Part 1: Because the current Eq. (0) yields Z = 30 + 3x_1 − (5/2)x_4, increasing only x_1 
would increase Z; that is, x_1 has the largest (and only) positive coefficient. Therefore, 
x_1 is chosen as the new entering basic variable. 


Part 2: The upper bounds on x_1 before the basic variable in the respective equations 
would become negative are shown in Table 4.2. Because x_5 gives the smallest upper 
bound for x_1, x_5 must be chosen as the leaving basic variable. 


Part 3: After eliminating x_1 from all equations in the current set except Eq. (3), where 
x_1 replaces x_5 as the basic variable, the new set of equations in proper form from 
Gaussian elimination is: 

\[
\begin{aligned}
(0)\quad & Z + \tfrac{3}{2}x_4 + x_5 = 36 \\
(1)\quad & x_3 + \tfrac{1}{3}x_4 - \tfrac{1}{3}x_5 = 2 \\
(2)\quad & x_2 + \tfrac{1}{2}x_4 = 6 \\
(3)\quad & x_1 - \tfrac{1}{3}x_4 + \tfrac{1}{3}x_5 = 2.
\end{aligned}
\]

Therefore, the next basic feasible solution is (x_1, x_2, x_3, x_4, x_5) = (2, 6, 2, 0, 0), 
yielding Z = 36. 
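
As a quick sanity check, the solution just obtained can be substituted back into the original augmented equations of the example. The sketch below is illustrative only; the names `constraints`, `objective`, and `check_solution` are assumptions made for this check and do not come from the text.

```python
# Augmented Wyndor Glass Co. constraints: coefficients of (x1, ..., x5) and right-hand side.
constraints = [
    ([1, 0, 1, 0, 0], 4),   # x1 + x3 = 4
    ([0, 2, 0, 1, 0], 12),  # 2*x2 + x4 = 12
    ([3, 2, 0, 0, 1], 18),  # 3*x1 + 2*x2 + x5 = 18
]
objective = [3, 5, 0, 0, 0]  # Z = 3*x1 + 5*x2

def check_solution(x):
    """Return (feasible, Z) for a candidate solution of the augmented problem."""
    satisfies_equations = all(
        sum(a * v for a, v in zip(row, x)) == rhs for row, rhs in constraints
    )
    nonnegative = all(v >= 0 for v in x)
    z = sum(c * v for c, v in zip(objective, x))
    return satisfies_equations and nonnegative, z

print(check_solution((2, 6, 2, 0, 0)))  # (True, 36)
```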


Table 4.2 Calculations for Determining Second Leaving Basic Variable for Wyndor Glass Co. Problem 

Basic Variable | Equation Number | Upper Bound for x_1
x_3            | 1               | x_1 ≤ 4/1 = 4
x_2            | 2               | No limit
x_5            | 3               | x_1 ≤ 6/3 = 2 ← minimum


1 Equivalently, the current Eq. (0) can be used directly, in which case all these coefficients have to be 
nonnegative (≥ 0) for the solution to be optimal. This is what is done in the tabular form of the simplex 
method presented in Sec. 4.4. 


OPTIMALITY TEST: Because the new form of the objective function is Z = 36 − 
(3/2)x_4 − x_5, so that the coefficient of neither nonbasic variable is positive, the current 
basic feasible solution just obtained must be optimal. Therefore, the desired solution 
to the original form of the problem is x_1 = 2, x_2 = 6, which yields Z = 36. 


Continuing the Learning Process with Your OR COURSEWARE 


This is the first of many points in the book where you may find it helpful to use your 
OR COURSEWARE (the diskettes packaged in the back of the book). This software 
includes a complete demonstration example of the simplex method in the algebraic 
form just presented. This vivid demonstration simultaneously displays both the algebra 
and the geometry of the simplex method as it dynamically evolves step by step. Like 
the many other demonstration examples accompanying other sections of the book 
(including the next section), this computer demonstration highlights concepts that are 
difficult to convey on the printed page. 

Another feature of your OR COURSEWARE is a collection of routines for 
interactively executing the various algorithms presented throughout the book. One 
such routine is for the algebraic form of the simplex method. Like the others, this 
routine performs nearly all of the calculations while you make the decisions step by 
step, thereby enabling you to focus on concepts rather than getting bogged down in 
a lot of number crunching. Therefore, you probably will want to use this routine for 
your homework on this section. The software will help you get started by letting you 
know whenever you make a mistake on the first iteration of a problem. Follow the 
instructions, and then use the HELP command whenever you are unclear on which 
computer operation should be done next. You can return to the demonstration example 
whenever you need to review what to do on the next step of the simplex method, and 
then come back when ready to where you were in the problem. When you finish the 
problem, you can print out everything you have done for your homework by choosing 
the print command under the FILE menu. 

Your OR COURSEWARE begins with a more complete introduction on how 
to use the software. 


4.4 The Simplex Method in Tabular Form 


The algebraic form of the simplex method presented in Sec. 4.3 may be the best one 
for learning the underlying logic of the algorithm. However, it is not the most convenient 
form for performing the required calculations. When you need to solve a 
problem by hand (or interactively with your OR COURSEWARE), we recommend 
the tabular form described in this section.¹ 

The tabular form of the simplex method is mathematically equivalent to the 
algebraic form. However, instead of writing down each set of equations in full detail, 
we use a simplex tableau to record only the essential information, namely, (1) the 
coefficients of the variables, (2) the constants on the right-hand side of the equations, 
and (3) the basic variable appearing in each equation. This saves writing the symbols 
for the variables in each of the equations, but what is even more important is the fact 
that it permits highlighting the numbers involved in arithmetic calculations and 
recording the computations compactly. 


1 A form more convenient for automatic execution on a computer is presented in Sec. 5.2. 

To introduce the tabular form, we consider the augmented form of the Wyndor 
Glass Co. problem as presented at the end of Sec. 4.2. This system of equations 
[(0) to (3)] can be expressed as shown in Table 4.3. This table shows the layout for 
any simplex tableau, where the column on the left indicates which basic variable 
appears in each equation for the current basic feasible solution. [Although only the x_j 
variables are basic or nonbasic, Z plays the role of the basic variable for Eq. (0).] 
For example, the Basic Variable column of Table 4.3 indicates that the initial basic 
feasible solution has basic variables x_3, x_4, x_5, so the nonbasic variables are the ones 
not listed, x_1 and x_2. After setting x_1 = 0, x_2 = 0, the Right Side column gives the 
resulting solution for basic variables, so that the initial basic feasible solution is 
(x_1, x_2, x_3, x_4, x_5) = (0, 0, 4, 12, 18) with Z = 0. 

The reason why the Right Side column always gives the values of the basic 
variables in the current basic feasible solution is that the simplex method requires that 
the simplex tableau starting (or ending) each iteration be in proper form from Gaussian 
elimination. This form is where the column for each basic variable contains only one 
nonzero coefficient, and this coefficient is a 1 in the row for this basic variable. (The 
columns for nonbasic variables can be anything.) Notice how the x_3, x_4, and x_5 
columns (as well as the Z column) in Table 4.3 fit this special pattern. Consequently, 
each equation contains exactly one basic variable with a nonzero coefficient, where 
this coefficient is 1, so this basic variable equals the constant on the right-hand side 
of its equation. (Remember that the nonbasic variables equal zero.) 
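
The definition of proper form from Gaussian elimination can be turned into a small check. The following is a minimal sketch, assuming the tableau is stored as a list of rows of x-coefficients (row 0 first, Z column and right side omitted); the function name `in_proper_form` and this storage layout are hypothetical choices for illustration.

```python
def in_proper_form(coeffs, basic_columns):
    """Check the proper-form pattern for the basic variable columns.

    coeffs[r] lists the coefficients of x1..xn in row r (row 0 is Eq. (0)).
    basic_columns[k-1] is the 0-based column of the basic variable for row k.
    Each basic column must hold a 1 in its own row and 0 in every other row,
    including row 0.
    """
    for k, col in enumerate(basic_columns, start=1):
        for r, row in enumerate(coeffs):
            expected = 1 if r == k else 0
            if row[col] != expected:
                return False
    return True

initial = [
    [-3, -5, 0, 0, 0],  # row 0
    [1, 0, 1, 0, 0],    # row 1
    [0, 2, 0, 1, 0],    # row 2
    [3, 2, 0, 0, 1],    # row 3
]
print(in_proper_form(initial, [2, 3, 4]))  # True: x3, x4, x5 are basic
print(in_proper_form(initial, [2, 1, 4]))  # False: the x2 column is not yet a unit column
```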

Under our current assumptions (stated at the beginning of the chapter) about the 
form of the original model, the initial simplex tableau is automatically in proper form 
from Gaussian elimination. When the simplex method moves from the current basic 
feasible solution to the next one, part 3 of the iterative step uses Gaussian elimination 
to restore this form for the new solution. 

The simplex method develops a simplex tableau for each new basic feasible 
solution obtained until an optimal solution is reached. The procedure, which is just a 
tabular representation of the algebraic procedure presented in Sec. 4.3, is outlined 
next. (Tie-breaking considerations are deferred to Sec. 4.5.) We continue to use the 
Wyndor Glass Co. example for illustrative purposes. 


INITIALIZATION STEP: Introduce slack variables. If the model is not in the form 
being assumed in this section, see Sec. 4.6 for the necessary adjustments. Otherwise, 
select the original variables to be the initial nonbasic variables (set equal to zero) 
and the slack variables to be the initial basic variables. 

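This selection yields the initial simplex tableau for the example already shown 
in Table 4.3, so the initial basic feasible solution is (0, 0, 4, 12, 18). Go next to the 
optimality test to determine if this solution is optimal. 


Table 4.3 Initial Simplex Tableau for Wyndor Glass Co. Problem 

Basic    | Eq. |   |      Coefficient of       | Right
Variable | No. | Z | x_1  x_2  x_3  x_4  x_5   | Side
Z        | 0   | 1 | -3   -5    0    0    0    |  0
x_3      | 1   | 0 |  1    0    1    0    0    |  4
x_4      | 2   | 0 |  0    2    0    1    0    | 12
x_5      | 3   | 0 |  3    2    0    0    1    | 18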


OPTIMALITY TEST: The current basic feasible solution is optimal if and only if 
every coefficient in Eq. (0) is nonnegative (≥ 0). If it is, stop; otherwise, go to the 
iterative step to obtain the next basic feasible solution, which involves changing one 
nonbasic variable to a basic variable (part 1) and vice versa (part 2) and then solving 
for the new solution (part 3). 

The example has two negative coefficients in Eq. (0), −3 for x_1 and −5 for 
x_2, so go to the iterative step. 


ITERATIVE STEP 
Part 1: Determine the entering basic variable by selecting the variable (automatically 
a nonbasic variable) with the negative coefficient having the largest absolute value in 
Eq. (0). Put a box around the column below this coefficient, and call this the pivot 
column. 

In the example, the largest (in absolute terms) negative coefficient is −5 for 
x_2 (5 > 3), so x_2 is to be changed to a basic variable. (This change is indicated in 
Table 4.4 by the box around the x_2 column below −5.) 
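
The optimality test and part 1 of the iterative step both read only row 0, so they can be sketched together. This is an illustrative sketch, not a prescribed implementation: the function name `entering_column` and the convention of returning `None` for an optimal tableau are assumptions made here, and ties are simply resolved in favor of the lowest index.

```python
from fractions import Fraction

def entering_column(row0):
    """row0 holds the Eq. (0) coefficients of x1..xn (right side excluded).

    Returns the 0-based index of the entering basic variable, i.e. the negative
    coefficient with the largest absolute value, or None when every coefficient
    is nonnegative (the optimality test is passed).
    """
    most_negative = min(row0)
    if most_negative >= 0:
        return None
    return row0.index(most_negative)  # first such index when there is a tie

print(entering_column([-3, -5, 0, 0, 0]))             # 1, that is, x_2
print(entering_column([0, 0, 0, Fraction(3, 2), 1]))  # None
```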


Part 2: Determine the leaving basic variable by (a) picking out each coefficient in 
the pivot column that is strictly positive (> 0), (b) dividing each of these coefficients 
into the "right side" for the same row, (c) identifying the equation that has the smallest 
of these ratios, and (d) selecting the basic variable for this equation. (This basic 
variable is the one that reaches zero first as the entering basic variable is increased.) 
Put a box around this equation's row in the tableau to the right of the Z column, and 
call the boxed row the pivot row. (Hereafter, we continue to use the term row to 
refer just to a row of numbers to the right of the Z column, including the right-side 
number, and we label the rows by the numbers in the Eq. No. column.) Also call the 
one number that is in both boxes the pivot number. 

The results of parts 1 and 2 for the example are shown in Table 4.4, where the 
minimum ratio test for determining the leaving basic variable is shown to the right 
of the tableau. The row 1 coefficient in the pivot column is 0, so the only two strictly 
positive coefficients are in rows 2 and 3. The ratios for these rows are 6 and 9, 
respectively, so the minimum ratio of 6 identifies row 2 as the pivot row (with 2 as 
the pivot number). Consequently, the leaving basic variable is x_4, the basic variable 
for row 2 shown in the first column. 


Table 4.4 Calculations to Determine First Leaving Basic Variable for Wyndor Glass Co. Problem 

          Basic    | Eq. |   |      Coefficient of       | Right |
Iteration Variable | No. | Z | x_1  x_2  x_3  x_4  x_5   | Side  | Ratio
0         Z        | 0   | 1 | -3   -5    0    0    0    |  0    |
          x_3      | 1   | 0 |  1    0    1    0    0    |  4    |
          x_4      | 2   | 0 |  0    2    0    1    0    | 12    | 12/2 = 6 ← minimum
          x_5      | 3   | 0 |  3    2    0    0    1    | 18    | 18/2 = 9


Part 3: Determine the new basic feasible solution by constructing a new simplex 
tableau in proper form from Gaussian elimination below the current one. (The first 
three columns are left unchanged except that the leaving basic variable in the Basic 
Variable column is replaced by the entering basic variable.) To change the coefficient 
of the new basic variable in the pivot row to 1, divide the entire row by the pivot 
number, so 

\[
\text{New pivot row} = \frac{\text{old pivot row}}{\text{pivot number}}.
\]


For the example, because the old pivot row is the boxed row 2 in the first tableau 
of Table 4.5, and the pivot number is 2, applying this formula yields the new pivot 
row shown as row 2 in the second tableau in Table 4.5. 

To complete the first iteration, we need to continue using Gaussian elimination 
to obtain a coefficient of 0 for the new basic variable x_2 in the other rows (including 
row 0) of this second tableau. Because row 1 already has a coefficient of 0 for x_2 in 
the first tableau, this row can be carried along to the second tableau without any 
change. However, rows 0 and 3 have pivot column coefficients of −5 and 2, respectively, 
so each of these rows needs to be changed by using the following formula: 


New row = old row − (pivot column coefficient × new pivot row). 


Alternatively, when the pivot column coefficient is negative (as for row 0), a more 
convenient form of this formula is: 


New row = old row + [(−pivot column coefficient) × new pivot row]. 


To illustrate, the missing rows for the second tableau of Table 4.5 are obtained 
as follows: 





Row 0:              [-3   -5    0    0    0,    0]
            -(-5)   [ 0    1    0   1/2   0,    6]
New row 0 =         [-3    0    0   5/2   0,   30].

Row 1: Unchanged because its pivot column coefficient is zero. 

Row 3:              [ 3    2    0    0    1,   18]
             -(2)   [ 0    1    0   1/2   0,    6]
New row 3 =         [ 3    0    0   -1    1,    6].


These calculations yield the new tableau shown in Table 4.6 for iteration 1. 
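
The two row formulas of part 3 translate directly into code. The sketch below is one possible rendering under stated assumptions: each tableau row is stored as the coefficients of x_1 to x_5 followed by the right side (the Z column is omitted because it never changes), and the function name `pivot` is a hypothetical choice. Exact fractions are used to mirror the hand calculations.

```python
from fractions import Fraction

def pivot(tableau, pivot_row, pivot_col):
    """Return a new tableau after Gaussian elimination on the pivot number.

    New pivot row = old pivot row / pivot number.
    New row       = old row - (pivot column coefficient * new pivot row).
    """
    pivot_number = tableau[pivot_row][pivot_col]
    new_pivot_row = [Fraction(v) / pivot_number for v in tableau[pivot_row]]
    result = []
    for r, row in enumerate(tableau):
        if r == pivot_row:
            result.append(new_pivot_row)
        else:
            coeff = row[pivot_col]  # zero means the row is carried along unchanged
            result.append([Fraction(v) - coeff * p for v, p in zip(row, new_pivot_row)])
    return result

initial = [
    [-3, -5, 0, 0, 0, 0],   # row 0
    [1, 0, 1, 0, 0, 4],     # row 1
    [0, 2, 0, 1, 0, 12],    # row 2 (pivot row)
    [3, 2, 0, 0, 1, 18],    # row 3
]
for row in pivot(initial, pivot_row=2, pivot_col=1):
    print([str(v) for v in row])
```

Running this on the initial tableau with row 2 and the x_2 column as the pivot reproduces the new rows 0 and 3 computed above.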


Table 4.5 Simplex Tableaux for Wyndor Glass Co. Problem after Revising First Pivot Row 

          Basic    | Eq. |   |      Coefficient of       | Right
Iteration Variable | No. | Z | x_1  x_2  x_3  x_4  x_5   | Side
0         Z        | 0   | 1 | -3   -5    0    0    0    |  0
          x_3      | 1   | 0 |  1    0    1    0    0    |  4
          x_4      | 2   | 0 |  0    2    0    1    0    | 12
          x_5      | 3   | 0 |  3    2    0    0    1    | 18
1         Z        | 0   | 1 |                           |
          x_3      | 1   | 0 |                           |
          x_2      | 2   | 0 |  0    1    0   1/2   0    |  6
          x_5      | 3   | 0 |                           |














Table 4.6 First Two Simplex Tableaux for Wyndor Glass Co. Problem 

          Basic    | Eq. |   |      Coefficient of       | Right
Iteration Variable | No. | Z | x_1  x_2  x_3  x_4  x_5   | Side
0         Z        | 0   | 1 | -3   -5    0    0    0    |  0
          x_3      | 1   | 0 |  1    0    1    0    0    |  4
          x_4      | 2   | 0 |  0    2    0    1    0    | 12
          x_5      | 3   | 0 |  3    2    0    0    1    | 18
1         Z        | 0   | 1 | -3    0    0   5/2   0    | 30
          x_3      | 1   | 0 |  1    0    1    0    0    |  4
          x_2      | 2   | 0 |  0    1    0   1/2   0    |  6
          x_5      | 3   | 0 |  3    0    0   -1    1    |  6








Because each basic variable always equals the right side of its equation, the 
new basic feasible solution is (0, 6, 4, 0, 6), with Z = 30. 

This work completes the iterative step, so we next return to the optimality test 
to check if the new basic feasible solution is optimal. Since the new row 0 still has 
a negative coefficient (−3 for x_1), the solution is not optimal, and so at least one 
more iteration is needed. 


ITERATION 2 FOR EXAMPLE: The second iteration starts anew from the second 
tableau of Table 4.6 to find the next basic feasible solution. Following the instructions 
for parts 1 and 2, we find x_1 as the entering basic variable and x_5 as the leaving basic 
variable, as shown in Table 4.7. 

Using the pivot number 3, the calculations to obtain the new tableau are 








Row 3: Because this is the pivot row, 
New row 3 = (1/3)   [ 3    0    0   -1     1,     6]
          =         [ 1    0    0  -1/3   1/3,    2].

Row 0:              [-3    0    0   5/2    0,    30]
            -(-3)   [ 1    0    0  -1/3   1/3,    2]
New row 0 =         [ 0    0    0   3/2    1,    36].

Row 1:              [ 1    0    1    0     0,     4]
             -(1)   [ 1    0    0  -1/3   1/3,    2]
New row 1 =         [ 0    0    1   1/3  -1/3,    2].

Row 2: Unchanged because its pivot column coefficient is zero. 


Table 4.7 Calculations to Determine Second Leaving Basic Variable for Wyndor Glass Co. Problem 

          Basic    | Eq. |   |      Coefficient of        | Right |
Iteration Variable | No. | Z | x_1  x_2  x_3  x_4   x_5   | Side  | Ratio
1         Z        | 0   | 1 | -3    0    0   5/2    0    | 30    |
          x_3      | 1   | 0 |  1    0    1    0     0    |  4    | 4/1 = 4
          x_2      | 2   | 0 |  0    1    0   1/2    0    |  6    |
          x_5      | 3   | 0 |  3    0    0   -1     1    |  6    | 6/3 = 2 ← minimum


Table 4.8 Complete Set of Simplex Tableaux for Wyndor Glass Co. Problem 

          Basic    | Eq. |   |       Coefficient of        | Right
Iteration Variable | No. | Z | x_1  x_2  x_3   x_4   x_5   | Side
0         Z        | 0   | 1 | -3   -5    0     0     0    |  0
          x_3      | 1   | 0 |  1    0    1     0     0    |  4
          x_4      | 2   | 0 |  0    2    0     1     0    | 12
          x_5      | 3   | 0 |  3    2    0     0     1    | 18
1         Z        | 0   | 1 | -3    0    0    5/2    0    | 30
          x_3      | 1   | 0 |  1    0    1     0     0    |  4
          x_2      | 2   | 0 |  0    1    0    1/2    0    |  6
          x_5      | 3   | 0 |  3    0    0    -1     1    |  6
2         Z        | 0   | 1 |  0    0    0    3/2    1    | 36
          x_3      | 1   | 0 |  0    0    1    1/3  -1/3   |  2
          x_2      | 2   | 0 |  0    1    0    1/2    0    |  6
          x_1      | 3   | 0 |  1    0    0   -1/3   1/3   |  2











We now have the set of tableaux shown in Table 4.8. Therefore, the new basic 
feasible solution is (2, 6, 2, 0, 0), with Z = 36. Going to the optimality test, we find 
that this solution is optimal because none of the coefficients in row 0 is negative, so 
the algorithm is finished. Consequently, the optimal solution to the Wyndor Glass 
Co. problem (before introducing slack variables) is x_1 = 2, x_2 = 6. 

Now compare Table 4.8 with the work done in Sec. 4.3 to verify that these two 
forms of the simplex method really are equivalent. Then note how the tabular form 
organizes the work being done in a considerably more convenient and compact form. 
We generally will use the tabular form hereafter. 
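
For readers who want to see the whole tabular procedure in one place, the following is a compact sketch of the steps described in this section. It is offered under explicit assumptions rather than as a definitive implementation: the problem must already be in proper form from Gaussian elimination, rows hold the coefficients of x_1 to x_n followed by the right side (the Z column is omitted), ties are broken by lowest index, and the function name `simplex` and its return value are hypothetical.

```python
from fractions import Fraction

def simplex(tableau, basis):
    """Tabular simplex method for a maximization problem in proper form.

    tableau: rows 0..m, each the coefficients of x1..xn then the right side.
    basis:   0-based column of the basic variable for rows 1..m.
    Returns (solution, Z, number of iterations).
    """
    t = [[Fraction(v) for v in row] for row in tableau]
    basis = list(basis)
    n = len(t[0]) - 1
    iterations = 0
    while True:
        # Optimality test: stop when no coefficient in row 0 is negative.
        col = min(range(n), key=lambda j: t[0][j])
        if t[0][col] >= 0:
            break
        # Part 2: minimum ratio test over strictly positive pivot column entries.
        candidates = [(t[r][-1] / t[r][col], r) for r in range(1, len(t)) if t[r][col] > 0]
        if not candidates:
            raise ValueError("Z is unbounded")
        _, prow = min(candidates)
        # Part 3: Gaussian elimination.
        pivot_number = t[prow][col]
        t[prow] = [v / pivot_number for v in t[prow]]
        for r in range(len(t)):
            if r != prow and t[r][col] != 0:
                factor = t[r][col]
                t[r] = [v - factor * p for v, p in zip(t[r], t[prow])]
        basis[prow - 1] = col
        iterations += 1
    solution = [Fraction(0)] * n
    for r, j in enumerate(basis, start=1):
        solution[j] = t[r][-1]
    return solution, t[0][-1], iterations

wyndor = [
    [-3, -5, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 4],
    [0, 2, 0, 1, 0, 12],
    [3, 2, 0, 0, 1, 18],
]
solution, z, iterations = simplex(wyndor, basis=[2, 3, 4])
print([str(v) for v in solution], z, iterations)
```

On the Wyndor Glass Co. data this sketch follows the same two iterations as Table 4.8.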


4.5 Tie Breaking in the Simplex Method 


You may have noticed in the preceding two sections that we never said what to do if 
the various choice rules of the simplex method do not lead to a clear-cut decision, 
either because of ties or other similar ambiguities. We discuss these details now. 


Tie for the Entering Basic Variable 


Part 1 of the iterative step chooses the nonbasic variable having the negative coefficient 
with the largest absolute value in the current Eq. (0) as the entering basic variable. 
Now suppose that two or more nonbasic variables are tied for having the largest 
negative coefficient (in absolute terms). For example, this would occur in the first 
iteration for the Wyndor Glass Co. problem (see Sec. 3.1) if its objective function 
were changed to Z = 3x_1 + 3x_2, so that the initial Eq. (0) becomes Z − 3x_1 − 
3x_2 = 0. How should this tie be broken? 

The answer is that the selection between these contenders may be made arbitrarily. 
The optimal solution will be reached eventually, regardless of the tied variable 
chosen, and there is no convenient method for predicting in advance which choice 
will lead there sooner. In this example, the simplex method happens to reach the 
optimal solution (2, 6) in three iterations with x_1 as the initial entering basic variable, 
versus two iterations if x_2 is chosen. 
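
The iteration counts quoted for this tie can be reproduced with a short experiment. The sketch below forces the first entering variable and then follows the usual rule; the function name `run_simplex` and the tableau layout (coefficients of x_1 to x_5 followed by the right side) are assumptions of this illustration, and the code presumes a bounded problem so that the ratio test always finds a pivot row.

```python
from fractions import Fraction

def run_simplex(tableau, first_entering):
    """Return (iterations, Z) when the first entering column is forced.

    Later entering columns follow the most negative coefficient rule.
    Assumes the problem is bounded, so the ratio test never comes up empty.
    """
    t = [[Fraction(v) for v in row] for row in tableau]
    n = len(t[0]) - 1
    iterations = 0
    col = first_entering
    while t[0][col] < 0:
        ratios = [(t[r][-1] / t[r][col], r) for r in range(1, len(t)) if t[r][col] > 0]
        _, prow = min(ratios)
        pivot_number = t[prow][col]
        t[prow] = [v / pivot_number for v in t[prow]]
        for r in range(len(t)):
            if r != prow:
                factor = t[r][col]
                t[r] = [v - factor * p for v, p in zip(t[r], t[prow])]
        iterations += 1
        col = min(range(n), key=lambda j: t[0][j])
    return iterations, t[0][-1]

# Wyndor Glass Co. problem with the objective changed to Z = 3*x1 + 3*x2.
tableau = [
    [-3, -3, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 4],
    [0, 2, 0, 1, 0, 12],
    [3, 2, 0, 0, 1, 18],
]
for first in (0, 1):
    iterations, z = run_simplex(tableau, first)
    print(f"x_{first + 1} first: {iterations} iterations, Z = {z}")
```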


Tie for the Leaving Basic Variable—Degeneracy 


Now suppose that two or more basic variables tie for being the leaving basic variable 
in part 2 of the iterative step. Does it matter which one is chosen? Theoretically it 
does, and in a very critical way, because of the following sequence of events that 
could occur. First, all of the tied basic variables reach zero simultaneously as the 
entering basic variable is increased. Therefore, the one or ones not chosen to be the 
leaving basic variable also will have a value of zero in the new basic feasible solution. 
(Basic variables with a value of zero are called degenerate, and the same term is 
applied to the corresponding basic feasible solution.) Second, if one of these 
degenerate basic variables retains its value of zero until it is chosen at a subsequent iteration 
to be a leaving basic variable, the corresponding entering basic variable must also 
remain zero (since it cannot be increased without making the leaving basic variable 
negative), so the value of Z must remain unchanged. Third, if Z may remain the same 
rather than increase at each iteration, the simplex method may then go around in a 
loop, repeating the same sequence of solutions periodically rather than eventually 
increasing Z toward an optimal solution. In fact, examples have been artificially 
constructed so that they do become entrapped in just such a perpetual loop. 

Fortunately, although a perpetual loop is theoretically possible, it has rarely 
been known to occur in practical problems. If a loop were to occur, one could always 
get out of it by changing the choice of the leaving basic variable. Furthermore, special 
rules¹ have been constructed for breaking ties so that such loops are always avoided. 
However, these rules have been virtually ignored in actual application, and they will 
not be repeated here. For your purposes, just break this kind of tie arbitrarily and 
proceed without worrying about the degenerate basic variables that result. 


No Leaving Basic Variable—Unbounded Z 


In part 2 of the iterative step there is one other possible outcome that we have not yet 
discussed, namely, that no variable qualifies to be the leaving basic variable.² This 
outcome would occur if the entering basic variable could be increased indefinitely 
without giving negative values to any of the current basic variables. In the tabular 
form, this means that every coefficient in the pivot column (excluding row 0) is either 
negative or zero. This situation is illustrated in Table 4.9 by deleting the last two 
functional constraints of the Wyndor Glass Co. problem (note the effect in Fig. 4.1). 


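Table 4.9 Initial Simplex Tableau for Wyndor Glass Co. Problem without Last Two Functional Constraints 

Basic    | Eq. |   | Coefficient of  | Right |
Variable | No. | Z | x_1  x_2  x_3   | Side  | Ratio
Z        | 0   | 1 | -3   -5    0    |  0    |
x_3      | 1   | 0 |  1    0    1    |  4    | No minimum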





1 See, for example, A. Charnes: "Optimality and Degeneracy in Linear Programming," Econometrica, 
20:160-170, 1952. 


2 Note that the analogous case (no entering basic variable) cannot occur in part 1 of the iterative step, 
because the optimality test would stop the algorithm first by indicating that an optimal solution has been 
reached. 

The interpretation of a tableau like the one shown in Table 4.9 is that the 
constraints do not prevent increasing the value of the objective function (Z) indefinitely, 
so the simplex method would stop with the message that Z is unbounded. 
Because even linear programming has not discovered a way of making infinite profits, 
the real message for practical problems is that a mistake has been made! The model 
probably has been misformulated, either by omitting relevant constraints or by stating 
them incorrectly. Alternatively, a computational mistake may have occurred. 
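
Detecting this outcome amounts to noticing that the ratio test has no candidates. A minimal sketch follows, assuming rows that hold the x-coefficients followed by the right side; the function name `ratio_test` and the use of `None` as the "unbounded" signal are illustrative choices, not part of the text.

```python
def ratio_test(tableau, pivot_col):
    """Return the pivot row index, or None when no coefficient in the pivot
    column (excluding row 0) is strictly positive, which means Z is unbounded."""
    best = None
    for r in range(1, len(tableau)):
        a = tableau[r][pivot_col]
        if a > 0:
            ratio = tableau[r][-1] / a
            if best is None or ratio < best[0]:
                best = (ratio, r)
    return None if best is None else best[1]

# Wyndor Glass Co. problem without the last two functional constraints.
reduced = [
    [-3, -5, 0, 0],  # row 0
    [1, 0, 1, 4],    # row 1: x1 + x3 = 4
]
print(ratio_test(reduced, pivot_col=1))  # None: x_2 can be increased indefinitely
```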


Multiple Optimal Solutions 


We mentioned in Sec. 3.2 (under the definition of optimal solution) that a problem 
can have more than one optimal solution. This fact was illustrated by changing the 
objective function in the Wyndor Glass Co. problem to Z = 3x_1 + 2x_2, so that every 
point on the line segment between (2, 6) and (4, 3) is optimal. We also noted in Sec. 
4.1 that every such problem has at least two optimal corner-point feasible solutions 
(basic feasible solutions). By taking weighted averages, these solutions can be used 
to identify every other optimal solution (as described in Probs. 13 and 14). 

The simplex method automatically stops after finding one optimal solution. 
However, for many applications of linear programming, there are intangible factors 
not incorporated into the model that can be used to make meaningful choices between 
solutions that are alternative optimal solutions according to the model. In such cases, 
these other optimal solutions should be identified as well. After the simplex method 
finds one optimal basic feasible solution, how do you recognize when there are others, 
and how do you find them? The answer is summarized as follows: 


Whenever a problem has more than one optimal basic feasible solution, at least one of 
the nonbasic variables has a coefficient of zero in the final Eq. (0), so increasing any 
such variable would not change the value of Z. Therefore, these other optimal basic 
feasible solutions can be identified (if desired) by performing additional iterations 
of the simplex method, each time choosing a nonbasic variable with a zero coefficient 
as the entering basic variable. 


Table 4.10 Complete Set of Simplex Tableaux to Obtain All Optimal Basic Feasible 
Solutions for Wyndor Glass Co. Problem with c_2 = 2 

          Basic    | Eq. |   |        Coefficient of        | Right | Solution
Iteration Variable | No. | Z | x_1  x_2  x_3   x_4   x_5    | Side  | Optimal?
0         Z        | 0   | 1 | -3   -2    0     0     0     |  0    | No
          x_3      | 1   | 0 |  1    0    1     0     0     |  4    |
          x_4      | 2   | 0 |  0    2    0     1     0     | 12    |
          x_5      | 3   | 0 |  3    2    0     0     1     | 18    |
1         Z        | 0   | 1 |  0   -2    3     0     0     | 12    | No
          x_1      | 1   | 0 |  1    0    1     0     0     |  4    |
          x_4      | 2   | 0 |  0    2    0     1     0     | 12    |
          x_5      | 3   | 0 |  0    2   -3     0     1     |  6    |
2         Z        | 0   | 1 |  0    0    0     0     1     | 18    | Yes
          x_1      | 1   | 0 |  1    0    1     0     0     |  4    |
          x_4      | 2   | 0 |  0    0    3     1    -1     |  6    |
          x_2      | 3   | 0 |  0    1  -3/2    0    1/2    |  3    |
Extra     Z        | 0   | 1 |  0    0    0     0     1     | 18    | Yes
          x_1      | 1   | 0 |  1    0    0   -1/3   1/3    |  2    |
          x_3      | 2   | 0 |  0    0    1    1/3  -1/3    |  2    |
          x_2      | 3   | 0 |  0    1    0    1/2    0     |  6    |


To illustrate, consider again the case just mentioned, where the objective function 
in the Wyndor Glass Co. problem is changed to Z = 3x_1 + 2x_2. The simplex 
method obtains the first three tableaux shown in Table 4.10 and stops with an optimal 
basic feasible solution. However, because a nonbasic variable (x_3) then has a zero 
coefficient in row 0, we perform one more iteration in Table 4.10 to identify the other 
optimal basic feasible solution. Thus the two optimal basic feasible solutions are 
(4, 3, 0, 6, 0) and (2, 6, 2, 0, 0), each yielding Z = 18. Notice that the last tableau 
also has a nonbasic variable (x_4) with a zero coefficient in Eq. (0). This situation is 
inevitable because the extra iteration(s) does not change row 0, so each leaving basic 
variable necessarily retains its zero coefficient. Making x_4 an entering basic variable 
now would only lead back to the third tableau. (Check this.) Therefore, these two are 
the only basic feasible solutions that are optimal, and all other optimal solutions are 
a weighted average of these two. Specifically, let α and (1 − α) denote the weights 
on these two solutions, where α must be some number between 0 and 1. Then every 
optimal solution is given by the vector formula α(4, 3, 0, 6, 0) + (1 − α)(2, 6, 2, 0, 0) 
for 0 ≤ α ≤ 1. [If the slack variables are ignored, this is just the 
formula for the line segment between (2, 6) and (4, 3) in Fig. 4.1.] 
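
The weighted-average formula is easy to check numerically. The sketch below evaluates a few values of α and confirms that each blended point satisfies the augmented constraints and gives Z = 18; the names `blend`, `objective`, and `is_feasible` are hypothetical helpers introduced only for this check.

```python
from fractions import Fraction

A = (4, 3, 0, 6, 0)  # first optimal basic feasible solution
B = (2, 6, 2, 0, 0)  # second optimal basic feasible solution

def blend(alpha):
    """Weighted average alpha*A + (1 - alpha)*B of the two optimal solutions."""
    return tuple(alpha * a + (1 - alpha) * b for a, b in zip(A, B))

def objective(x):
    return 3 * x[0] + 2 * x[1]  # Z = 3*x1 + 2*x2

def is_feasible(x):
    x1, x2, x3, x4, x5 = x
    return (x1 + x3 == 4 and 2 * x2 + x4 == 12 and 3 * x1 + 2 * x2 + x5 == 18
            and all(v >= 0 for v in x))

for alpha in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    x = blend(alpha)
    print(alpha, [str(v) for v in x], objective(x), is_feasible(x))
```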
4.6 Adapting to Other Model Forms 


Thus far we have presented the details of the simplex method under the assumption 
that the problem is in our standard form (maximize Z subject to functional constraints 
in ≤ form and nonnegativity constraints on all variables), and that b_i > 0 for all i = 
1, 2, ..., m. In this section we point out how to make the adjustments required for 
other legitimate forms of the linear programming model. You will see that all these 
adjustments can be made in the initialization step, so the rest of the simplex method 
can then be applied just as you have learned it already. 

The only real problem that the other forms for functional constraints (=, ≥, or 
b_i ≤ 0) introduce is in identifying an initial basic feasible solution. Before, this initial 
solution was found very conveniently by letting the slack variables be the initial basic 
variables, so that each one just equals the positive right-hand side of its equation. 
Now, something else must be done. The standard approach that is used for all these 
cases is the artificial variable technique. This technique constructs a more convenient 
revised problem by introducing a dummy variable (called an artificial variable) into 
each constraint that needs one. This new variable is introduced just for the purpose 
of being the initial basic variable for that equation. The usual nonnegativity constraints 
are placed on these variables, and the objective function also is modified to impose 
an exorbitant penalty on their having values larger than zero. The iterations of the 
simplex method then automatically force the artificial variables to disappear (become 
zero), one at a time until they are all gone, after which the real problem is solved. 

To illustrate the artificial variable technique, we first consider the case where 
the only nonstandard form in the problem is the presence of one or more equality 
constraints. 


Equality Constraints 


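Any equality constraint, 

\[
a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = b_i,
\]

actually is equivalent to a pair of inequality constraints: 

\[
\begin{aligned}
a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n &\le b_i \\
a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n &\ge b_i.
\end{aligned}
\]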


However, rather than making this substitution and thereby increasing the number of 
constraints, it is more convenient to use the artificial variable technique described 
next. 

Suppose that the Wyndor Glass Co. problem in Sec. 3.1 is modified to require 
that Plant 3 be used at full capacity. The only resulting change in the linear programming 
model is that the third constraint, 3x_1 + 2x_2 ≤ 18, instead becomes an equality 
constraint, 

3x_1 + 2x_2 = 18. 


Therefore, the feasible region for this problem (see Fig. 3.2) now consists of just the 
line segment connecting (2, 6) and (4, 3). 

After introducing the slack variables still needed for the inequality constraints, 
the augmented form of the problem (see the end of Sec. 4.2) becomes 


\[
\begin{aligned}
(0)\quad & Z - 3x_1 - 5x_2 = 0 \\
(1)\quad & x_1 + x_3 = 4 \\
(2)\quad & 2x_2 + x_4 = 12 \\
(3)\quad & 3x_1 + 2x_2 = 18.
\end{aligned}
\]


Unfortunately, these equations do not have an obvious initial basic feasible solution 
because there is no longer a slack variable to use as the initial basic variable for 
Eq. (3). The artificial variable technique circumvents this difficulty by introducing a 
nonnegative artificial variable (call it x̄_5)¹ into this equation, just as if it were a slack 
variable! Thus the technique revises the problem by changing Eq. (3) to 

\[
(3)\quad 3x_1 + 2x_2 + \bar{x}_5 = 18,
\]

along with the nonnegativity constraint, 

\[
\bar{x}_5 \ge 0,
\]

just as we had in the version of the Wyndor Glass Co. problem presented in Sec. 3.1. 
Proceeding as before, we now have an initial basic feasible solution (for the revised 
problem), (x_1, x_2, x_3, x_4, x̄_5) = (0, 0, 4, 12, 18). 

The effect of introducing an artificial variable is to enlarge the feasible region. 
In this case, the feasible region expands from just the line segment connecting (2, 6) 
and (4, 3) to the entire shaded area shown in Fig. 3.2. A feasible solution for the 
revised problem (with 3x_1 + 2x_2 + x̄_5 = 18 and x̄_5 ≥ 0) is also feasible for the 
original problem (with 3x_1 + 2x_2 = 18) if the artificial variable equals zero 
(x̄_5 = 0). 

Now suppose that the simplex method is permitted to proceed and obtain an 
optimal solution for the revised problem and that this solution happens to be feasible 
for the original problem. It can then be concluded that this solution must also be 
optimal for the original problem, so we are finished. (The reason is that this solution 
is the best one in the entire feasible region for the revised problem, which includes 
the feasible region for the original problem.) 


1 We shall always label the artificial variables by putting a bar over them. 

Unfortunately, there is no guarantee that the optimal solution to the revised 
problem also will be feasible for the original problem; that is, there is no guarantee 
until another revision is made. Using the Big M method, this new revision amounts 
to assigning such an overwhelming penalty to being outside the feasible region for the 
original problem that the optimal solution to the revised problem must lie within this 
region. Recall that the revised problem coincides with the original problem when 
x̄_5 = 0. Therefore, if the original objective function, Z = 3x_1 + 5x_2, is changed to 

\[
Z = 3x_1 + 5x_2 - M\bar{x}_5,
\]

where M denotes some huge positive number, then the maximum value of Z must 
occur when x̄_5 = 0 (x̄_5 cannot be negative). After a little more setting up (discussed 
next), applying the simplex method to this revised problem automatically leads to the 
desired solution. 

Using this revised objective function, Eq. (0) becomes 

\[
(0)\quad Z - 3x_1 - 5x_2 + M\bar{x}_5 = 0,
\]

or in tabular form, the preliminary row 0 (call it R_0) becomes 

R_0 = [-3   -5   0   0   M,   0]. 


However, this R_0 cannot be used in the initial tableau for applying the simplex method 
because it is not in proper form from Gaussian elimination. This proper form requires, 
in part, that every basic variable (excluding Z, which only pretends to be a basic 
variable) has been eliminated from Eq. (0), whereas x̄_5 now is a basic variable with 
a coefficient of M. Restoring this proper form is essential for both the optimality test 
and the procedure for determining the entering basic variable. This form normally is 
provided automatically by the Gaussian elimination part of the iterative step, and 
Gaussian elimination is used now to restore the form. Proceeding as if the column 
for the artificial variable (x̄_5) were the pivot column and the equation containing this 
variable (row 3) were the pivot row, the calculations are shown below. 





Row 0:              [-3   -5   0   0   M,    0]
              -M    [ 3    2   0   0   1,   18]
New row 0 =         [(-3M - 3), (-2M - 5), 0, 0, 0, -18M].
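
The same calculation can be carried out symbolically by storing each row 0 entry as a pair (a, b) that stands for aM + b. This representation and the function name `restore_row0` are assumptions of the sketch below; the function accepts a list of artificial-variable rows so that it also covers models with several equality constraints.

```python
def restore_row0(preliminary_row0, artificial_rows):
    """Subtract M times each artificial variable's row from the preliminary row 0.

    preliminary_row0: list of pairs (a, b), each meaning a*M + b.
    artificial_rows:  constraint rows (plain numbers) whose basic variable is artificial.
    Subtracting M times a number v only changes the multiplicative factor: a - v.
    """
    row0 = list(preliminary_row0)
    for row in artificial_rows:
        row0 = [(a - v, b) for (a, b), v in zip(row0, row)]
    return row0

# R_0 = [-3, -5, 0, 0, M, 0] written as (a, b) pairs.
R0 = [(0, -3), (0, -5), (0, 0), (0, 0), (1, 0), (0, 0)]
row3 = [3, 2, 0, 0, 1, 18]
print(restore_row0(R0, [row3]))
# [(-3, -3), (-2, -5), (0, 0), (0, 0), (0, 0), (-18, 0)]
```

The printed pairs correspond to −3M − 3, −2M − 5, 0, 0, 0, and −18M.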


This completes the additional work required in the initialization step for prob- 
lems of this type, and the rest of the simplex method proceeds just as before. The 
quantities involving M never appear anywhere except in row 0, so they need to be 
taken into account only in the optimality test and when determining an entering basic 
variable. One way of dealing with these quantities is to assign some particular (huge) 
numerical value to M and use the resulting numbers in row 0 in the usual way. 
However, this approach may result in significant round-off errors that invalidate the 
optimality test. Therefore, it is better to do what we have just shown, namely, express 
each coefficient in row 0 as a linear function aM + b of the symbolic quantity M by 
separately recording and updating the current numerical value of (1) the multiplicative 
factor a and (2) the additive factor b. Because M is assumed to be so large that b
always is negligible compared with aM when a ≠ 0, the decisions in the optimality
test and the choice of the entering basic variable are made by using just the
multiplicative factors in the usual way. The one exception is when this use leads to a tie
[where a tie for the optimality test means that the smallest multiplicative factor(s)
equals zero], in which case the tie would be broken by using the corresponding additive
factors.
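
This bookkeeping can be sketched in a few lines of Python. The sketch assumes that each
row 0 coefficient aM + b is stored as a pair (a, b); the function name `entering_variable`
and this representation are illustrative choices of ours, not part of the method itself.
Python orders tuples lexicographically, which mirrors the rule just stated: the
multiplicative factors decide, and the additive factors are consulted only to break a tie.

```python
def entering_variable(row0):
    """Return the index of the entering basic variable, or None if optimal.

    row0 holds one (a, b) pair per variable, standing for the row 0
    coefficient a*M + b of a maximization tableau. Because M is taken to be
    arbitrarily large, a*M + b is ordered first by a and then by b, which is
    how Python orders tuples.
    """
    best = min(range(len(row0)), key=lambda j: tuple(row0[j]))
    # Optimality test: stop when no coefficient is negative.
    if tuple(row0[best]) >= (0, 0):
        return None
    return best


# Initial row 0 for the Wyndor problem with the equality constraint
# (columns x1, x2, x3, x4 and the artificial variable).
print(entering_variable([(-3, -3), (-2, -5), (0, 0), (0, 0), (0, 0)]))
# Hypothetical variant where both multiplicative factors are -3.
print(entering_variable([(-3, -3), (-3, -5), (0, 0), (0, 0), (0, 0)]))
```

The first call selects column 0 and the second selects column 1, because the tie on the
multiplicative factor is resolved by the additive factors (-5 is smaller than -3).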

Using this approach on the example yields the simplex tableaux shown in Table
4.11. The optimal solution ($x_1 = 2$, $x_2 = 6$) is the same as for the first version of
the Wyndor Glass Co. problem (see Table 4.8 for its tableaux). However, a different
sequence of basic feasible solutions was obtained because a comparison of the initial
multiplicative factors (3 > 2) led to choosing $x_1$ rather than $x_2$ as the initial entering
basic variable. If the equality constraint had been $3x_1 + 3x_2 = 18$ instead, both
multiplicative factors would have been -3, so then a comparison of the additive
factors (5 > 3) would have led to choosing $x_2$ as before.

This example involved only one equality constraint. If a linear programming
model has more than one, each would be handled in just this same way. [If the
right-hand side is negative, multiply through both sides by (-1) first.] Thus each equality
constraint would be given an artificial variable to serve as its initial basic variable,
each of these variables would be assigned a coefficient of -M in the objective function
[or +M when the variable is brought to the left-hand side of Eq. (0)], and the resulting
row 0 would have subtracted from it M times each equality constraint row.
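
The construction of row 0 just described can be sketched as follows. This is a minimal
illustration under our own assumptions: the constraints are already in augmented
(equality) form, each row 0 entry is kept as a pair (a, b) standing for aM + b, and the
name `big_m_row0` and its arguments are hypothetical.

```python
from fractions import Fraction as F


def big_m_row0(c, A, b, artificial_rows):
    """Row 0 in proper form for Maximize Z = sum(c[j]*x[j]) - M*(artificials).

    A, b: constraints in augmented (equality) form, one column per variable.
    artificial_rows: {column of an artificial variable: index of its row}.
    Returns (coefficients, right-hand side); each entry is an (a, b) pair
    standing for a*M + b.
    """
    n = len(c)
    mult = [F(int(j in artificial_rows)) for j in range(n)]  # +M on artificial columns
    add = [-F(cj) for cj in c]                               # -c_j on every column
    rhs_mult = F(0)
    for i in artificial_rows.values():
        # Subtract M times the row of each artificial variable.
        mult = [a - F(A[i][j]) for j, a in enumerate(mult)]
        rhs_mult -= F(b[i])
    return list(zip(mult, add)), (rhs_mult, F(0))


# Wyndor problem with 3x1 + 2x2 = 18; column 4 is the artificial variable.
A = [[1, 0, 1, 0, 0],
     [0, 2, 0, 1, 0],
     [3, 2, 0, 0, 1]]
coeffs, rhs = big_m_row0([3, 5, 0, 0, 0], A, [4, 12, 18], {4: 2})
print(coeffs[:2], rhs)
```

The pairs for the first two columns come out as (-3, -3) and (-2, -5) with right-hand
side (-18, 0), that is, (-3M - 3), (-2M - 5), and -18M.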

The approach to other kinds of constraints requiring artificial variables is com- 
pletely analogous. To illustrate the adjustments for a variety of different forms, we 
will use the model for designing Mary’s radiation therapy, as presented in Sec. 3.4. 


For your convenience, this model is repeated below.

RADIATION THERAPY EXAMPLE

    Minimize    $Z = 0.4x_1 + 0.5x_2$,

    subject to  $0.3x_1 + 0.1x_2 \le 2.7$
                $0.5x_1 + 0.5x_2 = 6$
                $0.6x_1 + 0.4x_2 \ge 6$

    and         $x_1 \ge 0$, $x_2 \ge 0$.


Table 4.11 Complete Set of Simplex Tableaux for Wyndor Glass Co. Problem with an Equality Constraint

           Basic        Eq.   Coefficient of                                              Right
Iteration  variable     no.   Z    $x_1$     $x_2$     $x_3$     $x_4$   $\bar{x}_5$     side

0          Z            0     1    -3M - 3   -2M - 5   0         0       0               -18M
           $x_3$        1     0    1         0         1         0       0               4
           $x_4$        2     0    0         2         0         1       0               12
           $\bar{x}_5$  3     0    3         2         0         0       1               18

1          Z            0     1    0         -2M - 5   3M + 3    0       0               -6M + 12
           $x_1$        1     0    1         0         1         0       0               4
           $x_4$        2     0    0         2         0         1       0               12
           $\bar{x}_5$  3     0    0         2         -3        0       1               6

2          Z            0     1    0         0         -9/2      0       M + 5/2         27
           $x_1$        1     0    1         0         1         0       0               4
           $x_4$        2     0    0         0         3         1       -1              6
           $x_2$        3     0    0         1         -3/2      0       1/2             3

3          Z            0     1    0         0         0         3/2     M + 1           36
           $x_1$        1     0    1         0         0         -1/3    1/3             2
           $x_3$        2     0    0         0         1         1/3     -1/3            2
           $x_2$        3     0    0         1         0         1/2     0               6

The graphical solution for this example (originally presented in Fig. 3.7)
is repeated here in a slightly different form in Fig. 4.2. The three lines in the
figure, along with the two axes, constitute the five constraint lines of the problem. The
dots lying at the intersection of a pair of constraint lines are the corner-point
solutions. The only two corner-point feasible solutions are (6, 6) and (7.5, 4.5), and the
feasible region is the line segment connecting these two points. The optimal solution
is $(x_1, x_2) = (7.5, 4.5)$, with Z = 5.25.

[Figure 4.2 Graphical display of the radiation therapy example and its corner-point solutions.
Constraint boundaries shown: $0.3x_1 + 0.1x_2 \le 2.7$, $0.5x_1 + 0.5x_2 = 6$, $0.6x_1 + 0.4x_2 \ge 6$.
Dots = corner-point solutions; dark line segment = feasible region; optimal solution = (7.5, 4.5).]

We soon will show how the simplex method solves this problem in its entirety,
including the sequence of corner-point solutions obtained. However, we first must
convert the model into an appropriate form for applying the simplex method. Two
adjustments are to introduce a slack variable ($x_3 \ge 0$) into the first constraint,

    $0.3x_1 + 0.1x_2 \le 2.7 \;\rightarrow\; 0.3x_1 + 0.1x_2 + x_3 = 2.7$,

and an artificial variable ($\bar{x}_4 \ge 0$) into the equality constraint,

    $0.5x_1 + 0.5x_2 = 6 \;\rightarrow\; 0.5x_1 + 0.5x_2 + \bar{x}_4 = 6$.


The temporary effect of introducing $\bar{x}_4$ into this constraint (before the Big M method
forces $\bar{x}_4$ to be zero) is to allow solutions such that $0.5x_1 + 0.5x_2 \le 6$. Considering
the other constraints as well (see Fig. 4.2), the feasible region for the revised problem
is thereby expanded to include the entire triangle whose vertices are (6, 6), (7.5, 4.5),
and (8, 3).

The remaining adjustments for the other nonstandard forms in the model are 
described next. 


≥ Inequality Constraints


The direction of an inequality always is reversed when both sides are multiplied by
(-1). As a result, any functional constraint of the ≥ form can be converted into an
equivalent constraint of our standard ≤ form by changing the signs of all the numbers
on both sides.

Using this approach for the third constraint of the radiation therapy example,

    $0.6x_1 + 0.4x_2 \ge 6$
    $\rightarrow\; -0.6x_1 - 0.4x_2 \le -6$
    $\rightarrow\; -0.6x_1 - 0.4x_2 + x_5 = -6$,

where $x_5$ is the slack variable for this constraint. However, one more change is still
needed, as you will see next.


Negative Right-Hand Sides 


You may recall that the simplex method was presented in the preceding sections under
the assumption that $b_i > 0$ for all i = 1, 2, ..., m. This assumption enabled us to
select the slack variables to be the initial basic variables (equal to the right-hand sides)
and still obtain a nondegenerate basic feasible solution. We have since pointed out in
Sec. 4.5 that degeneracy (basic variables equal to zero) does not need to be avoided.
However, a negative right-hand side, such as in the third constraint,

    $-0.6x_1 - 0.4x_2 + x_5 = -6$,

would give a negative value for the slack variable ($x_5 = -6$) in the initial solution
(where $x_1 = 0$, $x_2 = 0$), which violates the nonnegativity constraint for this variable.
Multiplying through the equation by (-1) makes the right-hand side positive:

    $0.6x_1 + 0.4x_2 - x_5 = 6$,

but it also changes the coefficient of the slack variable to -1, so the variable still
would be negative. However, in this form the constraint can be viewed as an equality
constraint with a nonnegative right-hand side, so the artificial variable technique can
be applied just as described earlier. If we let $\bar{x}_6$ be the nonnegative artificial variable
for this constraint, its final form becomes

    $0.6x_1 + 0.4x_2 - x_5 + \bar{x}_6 = 6$,

where $\bar{x}_6$ is used as the initial basic variable ($\bar{x}_6 = 6$) for this equation and $x_5$ begins
as a nonbasic variable. The Big M method also would be applied just as before, as
we shall demonstrate shortly.

As usual, introducing this artificial variable enlarges the feasible region. The
original constraint allowed only solutions lying above or on the constraint boundary,
$0.6x_1 + 0.4x_2 = 6$. Now it allows any solution lying below this constraint boundary
as well, because both $x_5$ and $\bar{x}_6$ are constrained only to be nonnegative so their
difference ($\bar{x}_6 - x_5$) can be any positive or negative number. No solutions are
disallowed at all, so the temporary effect of introducing $\bar{x}_6$ has been to eliminate this
constraint in the revised problem. (We keep the constraint in the system of equations
only because it will become relevant again later, after the Big M method forces $\bar{x}_6$ to
be zero.)

We now have twice revised the original problem by expanding its feasible
region, first by introducing the artificial variable $\bar{x}_4$ into the equality constraint
($0.5x_1 + 0.5x_2 + \bar{x}_4 = 6$), and now by introducing $\bar{x}_6$. Consequently, the feasible
region for the revised problem is the entire polyhedron in Fig. 4.2 whose vertices are
(0, 0), (9, 0), (7.5, 4.5), and (0, 12).

You may have noticed that we took a somewhat circuitous route in converting
the third constraint from its original form, $0.6x_1 + 0.4x_2 \ge 6$, to its final version,
$0.6x_1 + 0.4x_2 - x_5 + \bar{x}_6 = 6$. In fact, we multiplied through the constraint by (-1)
twice along the way! Now that you have seen the motivation leading to the final form,
we should point out the following shortcut:

    $0.6x_1 + 0.4x_2 \ge 6$
    $\rightarrow\; 0.6x_1 + 0.4x_2 - x_5 = 6$                 ($x_5 \ge 0$)
    $\rightarrow\; 0.6x_1 + 0.4x_2 - x_5 + \bar{x}_6 = 6$     ($x_5 \ge 0$, $\bar{x}_6 \ge 0$).

In this form, $x_5$ is called a surplus variable because it subtracts the surplus of the
left-hand side over the right-hand side to convert the inequality into an equivalent
equation.
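
The adjustments described so far (slack, surplus, and artificial variables, plus the sign
change for a negative right-hand side) can be collected into one small routine. The
following Python sketch is only an illustration: the function name `to_augmented_form`,
the tuple layout of a constraint, and the labels in `kinds` are assumptions of ours.

```python
def to_augmented_form(n_vars, constraints):
    """Convert constraints to equality form with an obvious initial basis.

    constraints: list of (coefficients, sense, rhs), sense in {"<=", "=", ">="}.
    Returns (A, b, kinds, basis): augmented rows, right-hand sides, a label
    for every column, and the initial basic column of each row.
    """
    flip = {"<=": ">=", ">=": "<=", "=": "="}
    rows, extra = [], []          # extra: (kind, row index, coefficient)
    for i, (coeffs, sense, rhs) in enumerate(constraints):
        if rhs < 0:               # make the right-hand side nonnegative
            coeffs, rhs, sense = [-a for a in coeffs], -rhs, flip[sense]
        rows.append((list(coeffs), rhs))
        if sense == "<=":
            extra.append(("slack", i, 1))
        elif sense == "=":
            extra.append(("artificial", i, 1))
        else:                     # ">=": surplus first, then artificial
            extra.append(("surplus", i, -1))
            extra.append(("artificial", i, 1))
    A = [coeffs + [coef if r == i else 0 for (_, r, coef) in extra]
         for i, (coeffs, _) in enumerate(rows)]
    b = [rhs for _, rhs in rows]
    kinds = ["decision"] * n_vars + [kind for kind, _, _ in extra]
    # Each row gets exactly one slack or artificial column, in row order.
    basis = [n_vars + k for k, (kind, _, _) in enumerate(extra) if kind != "surplus"]
    return A, b, kinds, basis


A, b, kinds, basis = to_augmented_form(2, [([0.3, 0.1], "<=", 2.7),
                                           ([0.5, 0.5], "=", 6),
                                           ([0.6, 0.4], ">=", 6)])
for row, rhs in zip(A, b):
    print(row, rhs)
print(kinds, basis)
```

For the radiation therapy example the added columns come out in the same order used in
the text: a slack variable, an artificial variable, a surplus variable, and a second
artificial variable, with the slack and the two artificial variables as the initial basis.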


Minimization 

One straightforward way of minimizing Z with the simplex method is to exchange the
roles of the positive and negative coefficients in row 0 for both the optimality test
and part 1 of the iterative step. However, rather than changing our instructions for
the simplex method, we instead present the following simple way of converting any
minimization problem into an equivalent maximization problem:

    Minimizing  $Z = \sum_{j=1}^{n} c_j x_j$

is equivalent to

    maximizing  $(-Z) = \sum_{j=1}^{n} (-c_j) x_j$;

that is, the two formulations yield the same optimal solution(s).

The reason the two formulations are equivalent is that the smaller Z is, the larger
(-Z) is, so the solution that gives the smallest value of Z in the entire feasible region
must also give the largest value of (-Z) in this region.

Therefore, in the radiation therapy example, we make the following change in
the formulation:

    Minimize  $Z = 0.4x_1 + 0.5x_2$
    $\rightarrow$ Maximize  $(-Z) = -0.4x_1 - 0.5x_2$.

After introducing artificial variables ($\bar{x}_4$, $\bar{x}_6$) and then applying the Big M method, the
corresponding conversion is

    Minimize  $Z = 0.4x_1 + 0.5x_2 + M\bar{x}_4 + M\bar{x}_6$
    $\rightarrow$ Maximize  $(-Z) = -0.4x_1 - 0.5x_2 - M\bar{x}_4 - M\bar{x}_6$.


Solving the Radiation Therapy Example 


We now are nearly ready to apply the simplex method to the radiation therapy
example. Using the maximization form just obtained, the entire system of equations is
now

    (0)  $-Z + 0.4x_1 + 0.5x_2 + M\bar{x}_4 + M\bar{x}_6 = 0$
    (1)  $0.3x_1 + 0.1x_2 + x_3 = 2.7$
    (2)  $0.5x_1 + 0.5x_2 + \bar{x}_4 = 6$
    (3)  $0.6x_1 + 0.4x_2 - x_5 + \bar{x}_6 = 6$.

The basic variables for the initial basic feasible solution (for this revised
problem) are $x_3$, $\bar{x}_4$, and $\bar{x}_6$.

Do you see any problem with this system of equations for starting the simplex
method? Can the optimality test and the procedure for selecting an entering basic
variable be applied to this Eq. (0)? No! The problem is that the system of equations
is not yet in proper form from Gaussian elimination (where each basic variable has
been eliminated from every equation except its own equation, for which it has a
coefficient of +1). Equations (1) to (3) are fine, but the basic variables $\bar{x}_4$ and $\bar{x}_6$ still
need to be eliminated from Eq. (0) by Gaussian elimination. Because $\bar{x}_4$ and $\bar{x}_6$ both
have a coefficient of M, Eq. (0) needs to have subtracted from it both M times
Eq. (2) and M times Eq. (3). For example, the coefficient of $x_1$ in Eq. (0) becomes
0.4 - 0.5M - 0.6M = -1.1M + 0.4, where M is treated as a real (huge) number
whose value is fixed but unspecified. The calculations for all of the coefficients (and
the right-hand side) are summarized below, where the vectors are the relevant rows
of the simplex tableau corresponding to the above system of equations.

    Row 0:            [ 0.4             0.5             0   M   0   M,     0 ]
    -M times row 2: -M[ 0.5             0.5             0   1   0   0,     6 ]
    -M times row 3: -M[ 0.6             0.4             0   0  -1   1,     6 ]
    New row 0     =   [ (-1.1M + 0.4),  (-0.9M + 0.5),  0,  0,  M,  0,  -12M ]


The resulting initial simplex tableau, ready to begin the simplex method, is
shown at the top of Table 4.12. Applying the simplex method in just the usual way
then yields the sequence of simplex tableaux shown in the rest of Table 4.12. For the
optimality test and the selection of the entering basic variable at each iteration, the
quantities involving M are treated just as discussed in connection with Table 4.11.
Specifically, whenever M is present, only its multiplicative factor is used, unless there
is a tie, in which case the tie is broken by using the corresponding additive factors.
Just such a tie occurs in the last selection of an entering basic variable (see the
next-to-last tableau), where the coefficients of $x_3$ and $x_5$ in row 0 both have the same
multiplicative factor, -5/3. Comparing the additive factors, 11/6 < 7/3, leads to choosing
$x_5$ as the entering basic variable.

Table 4.12 The Big M Method for the Radiation Therapy Example

           Basic        Eq.   Coefficient of                                                                                                  Right
Iteration  variable     no.   Z    $x_1$          $x_2$               $x_3$            $\bar{x}_4$  $x_5$              $\bar{x}_6$        side

0          Z            0     -1   -1.1M + 0.4    -0.9M + 0.5         0                0            M                  0                  -12M
           $x_3$        1     0    0.3            0.1                 1                0            0                  0                  2.7
           $\bar{x}_4$  2     0    0.5            0.5                 0                1            0                  0                  6
           $\bar{x}_6$  3     0    0.6            0.4                 0                0            -1                 1                  6

1          Z            0     -1   0              (-16/30)M + 11/30   (11/3)M - 4/3    0            M                  0                  -2.1M - 3.6
           $x_1$        1     0    1              1/3                 10/3             0            0                  0                  9
           $\bar{x}_4$  2     0    0              1/3                 -5/3             1            0                  0                  1.5
           $\bar{x}_6$  3     0    0              0.2                 -2               0            -1                 1                  0.6

2          Z            0     -1   0              0                   (-5/3)M + 7/3    0            (-5/3)M + 11/6     (8/3)M - 11/6      -0.5M - 4.7
           $x_1$        1     0    1              0                   20/3             0            5/3                -5/3               8
           $\bar{x}_4$  2     0    0              0                   5/3              1            5/3                -5/3               0.5
           $x_2$        3     0    0              1                   -10              0            -5                 5                  3

3          Z            0     -1   0              0                   0.5              M - 1.1      0                  M                  -5.25
           $x_1$        1     0    1              0                   5                -1           0                  0                  7.5
           $x_5$        2     0    0              0                   1                0.6          1                  -1                 0.3
           $x_2$        3     0    0              1                   -5               3            0                  0                  4.5

Now see what the Big M method has done graphically in Fig. 4.2. Using just
the original decision variables ($x_1$, $x_2$), the sequence of corner-point solutions obtained
in Table 4.12 is:

    (0, 0) → (9, 0) → (8, 3) → (7.5, 4.5).

For the first two corner-point solutions, both $\bar{x}_4$ and $\bar{x}_6$ are greater than zero, indicating
that both $0.5x_1 + 0.5x_2 = 6$ and $0.6x_1 + 0.4x_2 \ge 6$ are violated. The Big M method
then succeeds in driving $\bar{x}_6$ to zero at (8, 3), so that $0.6x_1 + 0.4x_2 \ge 6$ is satisfied.
Next, $\bar{x}_4$ also is driven to zero at (7.5, 4.5), so that $0.5x_1 + 0.5x_2 = 6$ also is
satisfied, and the first feasible solution for the original problem has been obtained.
Fortuitously, this first feasible solution also is optimal, so no additional iterations are
needed.
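
To make the mechanics concrete, here is a compact Python sketch of the Big M method with
M kept symbolic. It is an illustration under our own assumptions rather than a
production code: row 0 entries are stored as pairs [a, b] standing for aM + b, exact
fractions are used to avoid round-off, and the name `big_m_simplex` and its arguments are
hypothetical.

```python
from fractions import Fraction as F


def big_m_simplex(c, A, b, basis, artificial):
    """Maximize sum(c[j]*x[j]) - M*(sum of artificial variables)
    subject to A x = b, x >= 0, starting from the given basic columns.

    Returns (final solution, row 0 right-hand side as an (a, b) pair, path).
    """
    m, n = len(A), len(c)
    basis = list(basis)
    T = [[F(v) for v in row] + [F(r)] for row, r in zip(A, b)]
    # Preliminary row 0: -c_j on each column, plus M on artificial columns.
    r0 = [[F(int(j in artificial)), -F(c[j])] for j in range(n)] + [[F(0), F(0)]]

    def eliminate(i, col):
        """Subtract (row 0 coefficient of col) times row i from row 0."""
        fa, fb = r0[col]
        for j in range(n + 1):
            r0[j][0] -= fa * T[i][j]
            r0[j][1] -= fb * T[i][j]

    def solution():
        x = [F(0)] * n
        for i, col in enumerate(basis):
            x[col] = T[i][n]
        return x

    for i, col in enumerate(basis):      # restore proper form
        eliminate(i, col)
    path = [solution()]
    while True:
        # Most negative coefficient; ties on a are broken by b.
        col = min(range(n), key=lambda j: tuple(r0[j]))
        if tuple(r0[col]) >= (0, 0):     # optimality test
            return solution(), tuple(r0[n]), path
        ratios = [(T[i][n] / T[i][col], i) for i in range(m) if T[i][col] > 0]
        if not ratios:
            raise ValueError("Z is unbounded")
        row = min(ratios)[1]             # minimum ratio test
        pivot = T[row][col]
        T[row] = [v / pivot for v in T[row]]
        for i in range(m):
            if i != row:
                f = T[i][col]
                T[i] = [v - f * w for v, w in zip(T[i], T[row])]
        eliminate(row, col)
        basis[row] = col
        path.append(solution())


# Radiation therapy example in maximization form (columns x1..x6).
A = [["0.3", "0.1", 1, 0, 0, 0],
     ["0.5", "0.5", 0, 1, 0, 0],
     ["0.6", "0.4", 0, 0, -1, 1]]
x, rhs, path = big_m_simplex(["-0.4", "-0.5", 0, 0, 0, 0], A, ["2.7", 6, 6],
                             basis=[2, 3, 5], artificial={3, 5})
print([(float(p[0]), float(p[1])) for p in path])
print("Z =", float(-rhs[1]))
```

The printed path is (0, 0), (9, 0), (8, 3), (7.5, 4.5) and the final value is Z = 5.25,
in agreement with the sequence of corner-point solutions given above. Keeping the two
factors separate sidesteps the round-off problem that arises when a huge number is
substituted for M.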

For other problems with artificial variables, it may be necessary to perform 
additional iterations to reach an optimal solution after obtaining the first feasible 
solution for the original problem. (This was the case for the example solved in Table 
4.11.) Thus the Big M method can be thought of as having two phases. In the first 
phase, all of the artificial variables are driven to zero (because of the penalty of M 
per unit for being greater than zero) in order to reach an initial basic feasible solution 
for the original problem. In the second phase, all of the artificial variables are kept 
at zero (because of this same penalty) while the simplex method generates a sequence 
of basic feasible solutions leading to an optimal solution. The two-phase method
described next is a streamlined procedure for performing these two phases directly,
without even introducing M explicitly.


The Two-Phase Method 


For the radiation therapy example just solved in Table 4.12, the Big M method uses
the following objective function (or its equivalent in maximization form) throughout
the entire procedure.

    Big M method:  Minimize  $Z = 0.4x_1 + 0.5x_2 + M\bar{x}_4 + M\bar{x}_6$.

By contrast, the two-phase method is able to drop M by using two different objective
functions.

    Two-phase method:
    Phase 1:  Minimize  $Z = \bar{x}_4 + \bar{x}_6$        (until $\bar{x}_4 = 0$, $\bar{x}_6 = 0$).
    Phase 2:  Minimize  $Z = 0.4x_1 + 0.5x_2$   (with $\bar{x}_4 = 0$, $\bar{x}_6 = 0$).


Before solving the example in this way, let us summarize the general method. 


SUMMARY OF THE TWO-PHASE METHOD

Initialization Step: Revise the constraints of the original problem by introducing
artificial variables as needed to obtain an obvious initial basic feasible solution for
the revised problem.

Phase 1: Use the simplex method to solve the linear programming problem:

    Minimize Z = sum of artificial variables, subject to the revised constraints.

The optimal solution obtained for this problem (with Z = 0) will be a basic feasible
solution for the original problem.

Phase 2: Drop the artificial variables (they are all zero now anyway).¹ Starting
from the basic feasible solution obtained at the end of Phase 1, use the simplex method
to solve the original problem.
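
The summary above can be sketched in Python as follows. This is a minimal tableau
implementation under stated assumptions: exact fractions are used, the entering variable
is the first column with the most negative coefficient, the special cases mentioned in the
footnote are not handled, and the function names (`pivot`, `simplex_max`, `two_phase`)
are our own.

```python
from fractions import Fraction as F


def pivot(T, row, col):
    """Gaussian elimination step on tableau T (row 0 first, RHS last)."""
    p = T[row][col]
    T[row] = [v / p for v in T[row]]
    for i in range(len(T)):
        if i != row and T[i][col] != 0:
            f = T[i][col]
            T[i] = [v - f * w for v, w in zip(T[i], T[row])]


def simplex_max(T, basis):
    """Maximize; basis[i] is the basic column of row i + 1."""
    ncols = len(T[0]) - 1
    for i, col in enumerate(basis):          # restore proper form in row 0
        if T[0][col] != 0:
            f = T[0][col]
            T[0] = [v - f * w for v, w in zip(T[0], T[i + 1])]
    while True:
        col = min(range(ncols), key=lambda j: T[0][j])
        if T[0][col] >= 0:                   # optimality test
            return
        ratios = [(T[i][-1] / T[i][col], i)
                  for i in range(1, len(T)) if T[i][col] > 0]
        if not ratios:
            raise ValueError("Z is unbounded")
        row = min(ratios)[1]                 # minimum ratio test
        pivot(T, row, col)
        basis[row - 1] = col


def two_phase(c_min, A, b, basis, artificial):
    """Minimize sum(c_min[j]*x[j]) subject to A x = b, x >= 0."""
    n = len(A[0])
    basis = list(basis)
    rows = [[F(v) for v in r] + [F(rhs)] for r, rhs in zip(A, b)]
    # Phase 1: Maximize -(sum of artificials); row 0 is -Z + sum = 0.
    T = [[F(int(j in artificial)) for j in range(n)] + [F(0)]] + rows
    simplex_max(T, basis)
    if T[0][-1] != 0:
        raise ValueError("no feasible solutions")
    if any(col in artificial for col in basis):
        raise NotImplementedError("degenerate artificial basic variable")
    x_phase1 = [F(0)] * n
    for i, col in enumerate(basis):
        x_phase1[col] = T[i + 1][-1]
    # Phase 2: drop the artificial columns, install the real objective.
    keep = [j for j in range(n) if j not in artificial]
    T2 = [[F(c_min[j]) for j in keep] + [F(0)]]
    T2 += [[r[j] for j in keep] + [r[-1]] for r in T[1:]]
    basis2 = [keep.index(col) for col in basis]
    simplex_max(T2, basis2)
    x = [F(0)] * n
    for i, col in enumerate(basis2):
        x[keep[col]] = T2[i + 1][-1]
    return x, -T2[0][-1], x_phase1


A = [["0.3", "0.1", 1, 0, 0, 0],
     ["0.5", "0.5", 0, 1, 0, 0],
     ["0.6", "0.4", 0, 0, -1, 1]]
x, Z, x_phase1 = two_phase(["0.4", "0.5", 0, 0, 0, 0], A, ["2.7", 6, 6],
                           basis=[2, 3, 5], artificial={3, 5})
print([float(v) for v in x_phase1], [float(v) for v in x], float(Z))
```

For the radiation therapy example, Phase 1 of this sketch ends at (6, 6, 0.3, 0, 0, 0) and
Phase 2 ends at $x_1 = 7.5$, $x_2 = 4.5$ with Z = 5.25. Which solution Phase 1 ends at depends
on how ties for the entering basic variable are broken; this sketch takes the first
candidate.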


Table 4.13 shows the result of applying Phase 1 to the radiation therapy
example. [Row 0 in the initial tableau is obtained by converting Minimize $Z = \bar{x}_4 + \bar{x}_6$
to Maximize $(-Z) = -\bar{x}_4 - \bar{x}_6$, and then using Gaussian elimination to eliminate
$\bar{x}_4$ and $\bar{x}_6$ from $-Z + \bar{x}_4 + \bar{x}_6 = 0$.] In the next-to-last tableau, there is a tie for the
entering basic variable between $x_3$ and $x_5$, which is broken arbitrarily in favor of $x_3$.
The solution obtained at the end of Phase 1, then, is $(x_1, x_2, x_3, \bar{x}_4, x_5, \bar{x}_6)$ = (6, 6,
0.3, 0, 0, 0) or, after dropping $\bar{x}_4$ and $\bar{x}_6$, $(x_1, x_2, x_3, x_5)$ = (6, 6, 0.3, 0).

As claimed in the Summary, this solution from Phase 1 is indeed a basic feasible
solution for the original problem because it is the solution (after setting $x_5 = 0$) to
the original constraints in augmented form,

    (1)  $0.3x_1 + 0.1x_2 + x_3 = 2.7$
    (2)  $0.5x_1 + 0.5x_2 = 6$
    (3)  $0.6x_1 + 0.4x_2 - x_5 = 6$.


¹ We are skipping over three other possibilities here: (1) artificial variables > 0 (discussed in the next
subsection), (2) artificial variables that are degenerate basic variables, and (3) retaining the artificial variables
as nonbasic variables in Phase 2 (and not allowing them to become basic) as an aid to subsequent sensitivity
analysis. Your OR COURSEWARE allows you to explore these possibilities.


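Table 4.13 Phase 1 of the Two-Phase Method for the Radiation Therapy Example

           Basic        Eq.   Coefficient of                                                    Right
Iteration  variable     no.   Z    $x_1$   $x_2$    $x_3$   $\bar{x}_4$  $x_5$   $\bar{x}_6$    side

0          Z            0     -1   -1.1    -0.9     0       0            1       0              -12
           $x_3$        1     0    0.3     0.1      1       0            0       0              2.7
           $\bar{x}_4$  2     0    0.5     0.5      0       1            0       0              6
           $\bar{x}_6$  3     0    0.6     0.4      0       0            -1      1              6

1          Z            0     -1   0       -16/30   11/3    0            1       0              -2.1
           $x_1$        1     0    1       1/3      10/3    0            0       0              9
           $\bar{x}_4$  2     0    0       1/3      -5/3    1            0       0              1.5
           $\bar{x}_6$  3     0    0       0.2      -2      0            -1      1              0.6

2          Z            0     -1   0       0        -5/3    0            -5/3    8/3            -0.5
           $x_1$        1     0    1       0        20/3    0            5/3     -5/3           8
           $\bar{x}_4$  2     0    0       0        5/3     1            5/3     -5/3           0.5
           $x_2$        3     0    0       1        -10     0            -5      5              3

3          Z            0     -1   0       0        0       1            0       1              0
           $x_1$        1     0    1       0        0       -4           -5      5              6
           $x_3$        2     0    0       0        1       3/5          1       -1             0.3
           $x_2$        3     0    0       1        0       6            5       -5             6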

In fact, after deleting the $\bar{x}_4$ and $\bar{x}_6$ columns, Table 4.13 shows one way of using
Gaussian elimination to solve this system of equations by reducing the system to the
form displayed in the final tableau.

For Phase 2, $\bar{x}_4$ and $\bar{x}_6$ are dropped. To start the simplex method from the basic
feasible solution, $(x_1, x_2, x_3, x_5)$ = (6, 6, 0.3, 0), rows 1-3 of the final tableau in
Table 4.13 are already in proper form from Gaussian elimination. However, we now
need to insert into row 0 the objective function for the original problem in this same
proper form. The sequence of steps to obtain this new row 0 (including using rows 1
and 3 to eliminate $x_1$ and $x_2$ from row 0) is shown below:

    Minimize  $Z = 0.4x_1 + 0.5x_2$
    $\rightarrow$ Maximize  $(-Z) = -0.4x_1 - 0.5x_2$
    $\rightarrow$ (0)  $-Z + 0.4x_1 + 0.5x_2 = 0$.

    Row 0:                 [ 0.4   0.5   0    0,      0   ]
    -0.4 times row 1: -0.4 [ 1     0     0   -5,      6   ]
    -0.5 times row 3: -0.5 [ 0     1     0    5,      6   ]
    New row 0       =      [ 0     0     0   -0.5,   -5.4 ].


The resulting initial simplex tableau for Phase 2 is shown at the top of Table
4.14. Applying the simplex method then leads in one iteration to the optimal solution
shown in the second tableau, $(x_1, x_2, x_3, x_5)$ = (7.5, 4.5, 0, 0.3).

Now see what the two-phase method has done graphically in Fig. 4.2. Using
just ($x_1$, $x_2$), the sequence of corner-point solutions obtained in Tables 4.13 and 4.14
is

    Phase 1:  (0, 0) → (9, 0) → (8, 3) → (6, 6)
    Phase 2:  (6, 6) → (7.5, 4.5).


Note that all of these solutions in Phase 1 are infeasible (except for the revised
problem) until the last one. Phase 2 then deals only with corner-point feasible
solutions.

Table 4.14 Phase 2 of the Two-Phase Method for the Radiation Therapy Example

           Basic      Eq.   Coefficient of                       Right
Iteration  variable   no.   Z    $x_1$  $x_2$  $x_3$  $x_5$      side

0          Z          0     -1   0      0      0      -0.5       -5.4
           $x_1$      1     0    1      0      0      -5         6
           $x_3$      2     0    0      0      1      1          0.3
           $x_2$      3     0    0      1      0      5          6

1          Z          0     -1   0      0      0.5    0          -5.25
           $x_1$      1     0    1      0      5      0          7.5
           $x_5$      2     0    0      0      1      1          0.3
           $x_2$      3     0    0      1      -5     0          4.5

If the tie for the entering basic variable in the next-to-last tableau of Table 4.13 
had been broken in the other way, then Phase 1 would have gone directly from (8, 3) 
to (7.5, 4.5). After using (7.5, 4.5) to set up the initial simplex tableau for Phase 2, 
the optimality test would have revealed that this solution is optimal, so no iterations 
would be done. 


It is interesting to compare the Big M and two-phase methods. Begin with their
objective functions.

    Big M method:  Minimize  $Z = 0.4x_1 + 0.5x_2 + M\bar{x}_4 + M\bar{x}_6$.

    Two-phase method:
    Phase 1:  Minimize  $Z = \bar{x}_4 + \bar{x}_6$.
    Phase 2:  Minimize  $Z = 0.4x_1 + 0.5x_2$.

Because the $M\bar{x}_4$ and $M\bar{x}_6$ terms dominate the $0.4x_1$ and $0.5x_2$ terms in the objective
function for the Big M method, this objective function is essentially equivalent to the
Phase 1 objective function as long as $\bar{x}_4$ and/or $\bar{x}_6$ are greater than zero. Then, when
both $\bar{x}_4 = 0$ and $\bar{x}_6 = 0$, the objective function for the Big M method becomes
completely equivalent to the Phase 2 objective function.

Because of these virtual equivalencies in objective functions, the Big M and 
two-phase methods generally have the same sequence of basic feasible solutions. The 
one possible exception is when a tie for the entering basic variable occurs in Phase 1 
of the two-phase method, as happened in the third tableau of Table 4.13. Notice that 
the first three tableaux of Tables 4.12 and 4.13 are almost identical, with the only 
difference being that the multiplicative factors of M in Table 4.12 become the sole 
quantities in the corresponding spots in Table 4.13. Consequently, the additive factors 
that broke the tie for the entering basic variable in the third tableau of Table 4.12 
were not present to break this same tie in Table 4.13. The result for this example was 
an extra iteration for the two-phase method. Generally, however, the advantage of 
having the additive factors is minimal. 

The two-phase method streamlines the Big M method by using only the multi- 
plicative factors in Phase 1 and by dropping the artificial variables in Phase 2. (The 
Big M method could combine the multiplicative and additive factors by assigning an 
actual huge number to M, but this might create numerical instability problems.) For 
these reasons, the two-phase method is commonly used in computer codes. 
























































No Feasible Solutions


So far in this section we have been concerned primarily with the fundamental problem
of identifying an initial basic feasible solution when an obvious one is not available.
You have seen how the artificial variable technique constructs an artificial problem
and obtains an initial basic feasible solution for the revised problem instead. Either
the Big M method or the two-phase method then enables the simplex method to begin
its pilgrimage toward the basic feasible solutions, and ultimately toward the optimal
solution, for the original problem.

However, you should be wary of a certain pitfall with this approach. There may
be no obvious choice for the initial basic feasible solution for the very good reason
that there are no feasible solutions at all! Nevertheless, by constructing an artificial
feasible solution, there is nothing to prevent the simplex method from proceeding as
usual and ultimately reporting a supposedly optimal solution.

Fortunately, the artificial variable technique provides the following signpost to
indicate when this has happened:

    If the original problem has no feasible solutions, then either the Big M method or
    Phase 1 of the two-phase method yields a final solution that has at least one artificial
    variable greater than zero. Otherwise, they all equal zero.

To illustrate, let us change the first constraint in the radiation therapy example
(see Fig. 4.2) as follows:

    $0.3x_1 + 0.1x_2 \le 2.7 \;\rightarrow\; 0.3x_1 + 0.1x_2 \le 1.8$,

so that the problem no longer has any feasible solutions. Applying the Big M method
just as before (see Table 4.12) yields the tableaux shown in Table 4.15. (Phase 1 of
the two-phase method yields the same tableaux except that each expression involving
M is replaced by just the multiplicative factor.) Hence the Big M method normally
would be indicating that the optimal solution is (3, 9, 0, 0, 0, 0.6). However, since
an artificial variable $\bar{x}_6 = 0.6 > 0$, the real message here is that the problem has no
feasible solutions.
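
The signpost can be checked mechanically. The following Python sketch runs only Phase 1
(minimizing the sum of the artificial variables) on both versions of the example and
reports the minimum sum. It is an illustration under our own assumptions: exact
fractions, a first-candidate rule for ties, and a hypothetical function name `phase1`.

```python
from fractions import Fraction as F


def phase1(A, b, basis, artificial):
    """Minimize the sum of the artificial variables s.t. A x = b, x >= 0.

    Returns (solution, minimum sum of the artificial variables).
    """
    n = len(A[0])
    basis = list(basis)
    T = [[F(v) for v in r] + [F(rhs)] for r, rhs in zip(A, b)]
    # Row 0 of Maximize (-Z): -Z + (sum of artificials) = 0, then proper form.
    r0 = [F(int(j in artificial)) for j in range(n)] + [F(0)]
    for i, col in enumerate(basis):
        if col in artificial:
            r0 = [v - w for v, w in zip(r0, T[i])]
    while True:
        col = min(range(n), key=lambda j: r0[j])
        if r0[col] >= 0:                     # optimality test
            break
        # Phase 1 is bounded below by zero, so some ratio always exists here.
        ratios = [(T[i][-1] / T[i][col], i)
                  for i in range(len(T)) if T[i][col] > 0]
        row = min(ratios)[1]
        p = T[row][col]
        T[row] = [v / p for v in T[row]]
        for i in range(len(T)):
            if i != row:
                f = T[i][col]
                T[i] = [v - f * w for v, w in zip(T[i], T[row])]
        f = r0[col]
        r0 = [v - f * w for v, w in zip(r0, T[row])]
        basis[row] = col
    x = [F(0)] * n
    for i, col in enumerate(basis):
        x[col] = T[i][-1]
    return x, -r0[-1]


A = [["0.3", "0.1", 1, 0, 0, 0],
     ["0.5", "0.5", 0, 1, 0, 0],
     ["0.6", "0.4", 0, 0, -1, 1]]
x_ok, sum_ok = phase1(A, ["2.7", 6, 6], [2, 3, 5], {3, 5})
x_bad, sum_bad = phase1(A, ["1.8", 6, 6], [2, 3, 5], {3, 5})
print(float(sum_ok), float(sum_bad), [float(v) for v in x_bad])
```

With the original right-hand side 2.7 the minimum sum is 0, so a feasible solution exists.
With 1.8 the minimum sum is 0.6, attained at (3, 9, 0, 0, 0, 0.6), which is the signal
that the revised problem has no feasible solutions.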

Table 4.15 The Big M Method for the Revision of the Radiation Therapy Example That Has No Feasible Solutions

           Basic        Eq.   Coefficient of                                                                                   Right
Iteration  variable     no.   Z    $x_1$          $x_2$               $x_3$           $\bar{x}_4$   $x_5$   $\bar{x}_6$        side

0          Z            0     -1   -1.1M + 0.4    -0.9M + 0.5         0               0             M       0                  -12M
           $x_3$        1     0    0.3            0.1                 1               0             0       0                  1.8
           $\bar{x}_4$  2     0    0.5            0.5                 0               1             0       0                  6
           $\bar{x}_6$  3     0    0.6            0.4                 0               0             -1      1                  6

1          Z            0     -1   0              (-16/30)M + 11/30   (11/3)M - 4/3   0             M       0                  -5.4M - 2.4
           $x_1$        1     0    1              1/3                 10/3            0             0       0                  6
           $\bar{x}_4$  2     0    0              1/3                 -5/3            1             0       0                  3
           $\bar{x}_6$  3     0    0              0.2                 -2              0             -1      1                  2.4

2          Z            0     -1   0              0                   M + 0.5         1.6M - 1.1    M       0                  -0.6M - 5.7
           $x_1$        1     0    1              0                   5               -1            0       0                  3
           $x_2$        2     0    0              1                   -5              3             0       0                  9
           $\bar{x}_6$  3     0    0              0                   -1              -0.6          -1      1                  0.6


Variables Allowed to Be Negative


In most practical problems, negative values for the decision variables would have no
physical meaning, so it is necessary to include nonnegativity constraints in the
formulations of their linear programming models. However, this is not always the case.
To illustrate, suppose that the Wyndor Glass Co. problem is changed so that product
1 already is in production, and the first decision variable $x_1$ represents the increase in
its production rate. Therefore, a negative value of $x_1$ would indicate that product 1 is
to be cut back by that amount. Such reductions might be desirable to allow a larger
production rate for the new, more profitable product 2, so negative values should be
allowed for $x_1$ in the model.

Since the procedure for determining the leaving basic variable requires that all 
the variables have nonnegativity constraints, any problem containing variables allowed 
to be negative must be converted into an equivalent problem involving only non- 
negative variables before applying the simplex method. Fortunately, this conversion 
can be done. The modification required for each variable depends upon whether or 
not it has a (negative) lower bound on the values allowed. Each of these two cases 
is now discussed. 


VARIABLES WITH A BOUND ON THE NEGATIVE VALUES ALLOWED: Consider any
decision variable $x_j$ that is allowed to have negative values that satisfy a constraint of
the form

    $x_j \ge L_j$,

where $L_j$ is some negative constant. This constraint can be converted into a
nonnegativity constraint by making the change of variables,

    $x_j' = x_j - L_j$,  so  $x_j' \ge 0$.

Thus ($x_j' + L_j$) would be substituted for $x_j$ throughout the model, so that the redefined
decision variable $x_j'$ cannot be negative.

To illustrate, suppose that the current production rate for product 1 in the
Wyndor Glass Co. problem is 10. With the definition of $x_1$ just given, the complete model
at this point is the same as that given in Sec. 3.1 except that the nonnegativity
constraint, $x_1 \ge 0$, is replaced by

    $x_1 \ge -10$.

To obtain the equivalent model needed for the simplex method, this decision variable
would be redefined as the total production rate of product 1,

    $x_1' = x_1 + 10$,


which yields the changes in the objective function and constraints as shown:

    $Z = 3x_1 + 5x_2$             $Z = 3(x_1' - 10) + 5x_2$              $Z = -30 + 3x_1' + 5x_2$
    $x_1 \le 4$                   $(x_1' - 10) \le 4$                    $x_1' \le 14$
    $2x_2 \le 12$            →    $2x_2 \le 12$                     →    $2x_2 \le 12$
    $3x_1 + 2x_2 \le 18$          $3(x_1' - 10) + 2x_2 \le 18$           $3x_1' + 2x_2 \le 48$
    $x_1 \ge -10$, $x_2 \ge 0$    $(x_1' - 10) \ge -10$, $x_2 \ge 0$     $x_1' \ge 0$, $x_2 \ge 0$
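
The effect of this substitution on the data of the model is easy to compute. A minimal
Python sketch follows, assuming constraints of the form A x ≤ b and a maximization
objective; the function name `shift_lower_bound` is hypothetical.

```python
def shift_lower_bound(c, A, b, j, lower):
    """Substitute x_j = x_j' + lower (so that x_j' >= 0) in
    Maximize sum(c*x) subject to A x <= b.

    The coefficients are unchanged; only the constant terms move.
    Returns (constant added to the objective, new right-hand sides).
    """
    constant = c[j] * lower
    new_b = [bi - row[j] * lower for row, bi in zip(A, b)]
    return constant, new_b


# Wyndor Glass Co. problem with x1 >= -10.
constant, new_b = shift_lower_bound([3, 5], [[1, 0], [0, 2], [3, 2]],
                                    [4, 12, 18], j=0, lower=-10)
print(constant, new_b)
```

For the Wyndor example this gives the constant -30 in the objective function and the
right-hand sides 14, 12, and 48, matching the final column of the display above.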








VARIABLES WITH NO BOUND ON THE NEGATIVE VALUES ALLOWED: In the case
where $x_j$ does not have a lower bound constraint in the model formulated, another
approach is required: $x_j$ is replaced throughout the model by the difference of two new
nonnegative variables,

    $x_j = x_j^+ - x_j^-$,  where $x_j^+ \ge 0$, $x_j^- \ge 0$.

Since $x_j^+$ and $x_j^-$ can have any nonnegative values, this difference ($x_j^+ - x_j^-$) can
have any value (positive or negative), so it is a legitimate substitute for $x_j$ in the
model. But after such substitutions, the simplex method can proceed with just
nonnegative variables.

The new variables, $x_j^+$ and $x_j^-$, have a simple interpretation. By the geometric
definition of corner-point feasible solution (see Sec. 5.1), each basic feasible solution
for the new form of the model necessarily has the property that either $x_j^+ = 0$ or
$x_j^- = 0$ (or both). Therefore, at the optimal solution obtained by the simplex method,

    $x_j^+ = x_j$ if $x_j \ge 0$, and 0 otherwise;

    $x_j^- = |x_j|$ if $x_j \le 0$, and 0 otherwise;

so that $x_j^+$ represents the positive part of the decision variable $x_j$ and $x_j^-$ its negative
part (as suggested by the superscripts).

To illustrate this approach, let us use the same example as for the bounded
variable case. However, now suppose that the $x_1 \ge -10$ constraint was not included
in the original model because it clearly would not change the optimal solution. (In
some problems, certain variables do not need explicit lower bound constraints because
the functional constraints already prevent lower values.) Therefore, before applying
the simplex method, $x_1$ would be replaced by the difference,

    $x_1 = x_1^+ - x_1^-$,  where $x_1^+ \ge 0$, $x_1^- \ge 0$,

as shown:

    Maximize  $Z = 3x_1 + 5x_2$          Maximize  $Z = 3x_1^+ - 3x_1^- + 5x_2$
    $x_1 \le 4$                          $x_1^+ - x_1^- \le 4$
    $2x_2 \le 12$                   →    $2x_2 \le 12$
    $3x_1 + 2x_2 \le 18$                 $3x_1^+ - 3x_1^- + 2x_2 \le 18$
    $x_2 \ge 0$ (only)                   $x_1^+ \ge 0$, $x_1^- \ge 0$, $x_2 \ge 0$
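
In terms of the model data, this replacement simply adds a column that is the negative of
the column for $x_j$. A short Python sketch, with hypothetical function names
`split_free_variable` and `recover`:

```python
def split_free_variable(c, A, j):
    """Replace x_j by x_j_plus - x_j_minus (both nonnegative).

    The column for x_j_minus is inserted right after column j and is the
    negative of column j, in the objective and in every constraint.
    """
    new_c = c[:j + 1] + [-c[j]] + c[j + 1:]
    new_A = [row[:j + 1] + [-row[j]] + row[j + 1:] for row in A]
    return new_c, new_A


def recover(x_plus, x_minus):
    """Value of the original variable from its positive and negative parts."""
    return x_plus - x_minus


# Wyndor Glass Co. problem with x1 unrestricted in sign.
new_c, new_A = split_free_variable([3, 5], [[1, 0], [0, 2], [3, 2]], j=0)
print(new_c, new_A)
```

For the Wyndor example the objective coefficients become 3, -3, 5 and the constraint
rows become (1, -1, 0), (0, 0, 2), and (3, -3, 2), as in the display above.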





From a computational viewpoint, this approach has the disadvantage that the
new equivalent model to be used has more variables than the original model. In fact,
if all the original variables lack lower bound constraints, the new model will have
twice as many variables. Fortunately, the approach can be modified slightly so that
the number of variables is increased by only one, regardless of how many original
variables need to be replaced. This modification is done by replacing each such
variable $x_j$ by

    $x_j = x_j' - x''$,  where $x_j' \ge 0$, $x'' \ge 0$,

instead, where $x''$ is the same variable for all relevant j. The interpretation of $x''$ in
this case is that $-x''$ is the current value of the largest (in absolute terms) negative
original variable, so that $x_j'$ is the amount by which $x_j$ exceeds this value. Thus the
simplex method now can make some of the $x_j'$ variables larger than zero even when
$x'' > 0$.




4.7 Post-Optimality Analysis 


We stressed in Secs. 2.3, 2.4, and 2.5 that post-optimality analysis (the analysis done
after an optimal solution is obtained for the initial version of the model) constitutes
a major and important part of most operations research studies. This is particularly
true for typical linear programming applications. In this section, we focus on the role
of the simplex method in performing this analysis.

Table 4.16 summarizes the typical steps in post-optimality analysis for linear 
programming studies. The last column of Table 4.16 identifies some algorithmic 
techniques that involve the simplex method. These techniques are introduced briefly 
here with the technical details deferred to later chapters. 


Reoptimization 


After having found an optimal solution for one version of a linear programming model, 
we frequently must solve again (often many times) for a slightly different version of 
the model. We nearly always have to solve again several times during the model 
debugging stage (described in Secs. 2.3 and 2.4), and we usually have to do so a 
large number of times during the later stages of post-optimality analysis as well. 

One approach is simply to reapply the simplex method from scratch for each 
new version of the model, even though each run may require hundreds or even thou- 
sands of iterations for large problems. However, a much more efficient approach is to 
reoptimize. Reoptimization involves deducing how changes in the model get carried 
along to the final simplex tableau (as described in Secs. 5.3 and 6.6). This revised 
tableau and the optimal solution for the prior model are then used as the initial tableau 
and the initial basic solution for solving the new model. If this solution is feasible
for the new model, then the simplex method is applied in the usual way, starting from
this initial basic feasible solution. If the solution is not feasible, a related algorithm
called the dual simplex method (described in Sec. 9.2) probably can be applied to
find the new optimal solution,¹ starting from this initial basic solution.


Table 4.16 Post-Optimality Analysis for Linear Programming

Task                       Purpose                                       Technique
-------------------------  --------------------------------------------  --------------------
Model debugging            Find errors and weaknesses in model           Reoptimization
Model validation           Demonstrate validity of final model           See Sec. 2.4
Final managerial           Make appropriate division of organizational   Shadow prices
  decisions on resource    resources between activities under study
  allocations (the $b_i$)  and other important activities
Evaluate estimates of      Determine crucial estimates that may affect   Sensitivity analysis
  model parameters         optimal solution for further study
Evaluate trade-offs        Determine best trade-off                      Parametric linear
  between model                                                          programming
  parameters








1 The one requirement for using the dual simplex method here is that the optimality test still passes when
applied to row 0 of the revised final tableau. If not, then still another algorithm called the primal-dual
method can be used instead.


The big advantage of this reoptimization technique over re-solving from scratch
is that an optimal solution for the revised model probably is going to be much closer 
to the prior optimal solution than to an initial basic feasible solution constructed in 
the usual way for the simplex method. Therefore, assuming that the model revisions 
were modest, only a few iterations should be required to reoptimize instead of the 
hundreds or thousands that may be required when starting from scratch. In fact, the 
optimal solutions for the prior and revised models are frequently the same, in which 
case the reoptimization technique requires only one application of the optimality test 
and no iterations. 
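
The feasibility check at the heart of reoptimization can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of Secs. 5.3 and 6.6: it uses the Wyndor Glass Co. data, assumes the prior optimal basis consists of x3 (the slack for resource 1), x2, and x1, and the function names are hypothetical. It recomputes the values of those basic variables for a revised right-hand side and reports whether they are all nonnegative.

```python
from fractions import Fraction

def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(rhs)
    # Augmented matrix [matrix | rhs].
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Swap a row with a nonzero pivot into position (assumes a nonsingular matrix).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# Constraint columns of the basic variables, in the assumed order (x3, x2, x1).
BASIS = [[1, 0, 1],
         [0, 2, 0],
         [0, 2, 3]]

def prior_basis_values(b):
    """Values of (x3, x2, x1) when the prior optimal basis is kept for a new b."""
    return solve_linear_system(BASIS, b)

def prior_basis_still_feasible(b):
    """True if every basic variable stays nonnegative for the revised b."""
    return all(value >= 0 for value in prior_basis_values(b))

print(prior_basis_values([4, 13, 18]))          # b2 raised from 12 to 13
print(prior_basis_still_feasible([4, 20, 18]))  # b2 raised to 20
```

Because a change in the right-hand sides leaves the coefficients in row 0 unchanged, a prior basis that remains feasible is still optimal, which is the "no iterations" case described above. When the check fails (as it does for b2 = 20 here), the basic solution is infeasible and a method such as the dual simplex method would be needed.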


Shadow Prices 


Recall (see Tables 3.2 and 3.3) that linear programming problems typically can be
interpreted as allocating resources to activities, where the $b_i$ represent the amounts of
the respective resources being made available for the activities under consideration.
In many cases, there may be some latitude in the amounts that will be made available.
If so, the $b_i$ used in the initial (validated) model actually may represent management’s
tentative initial decision on how much of the organization’s resources will be provided
to the activities considered in the model instead of to other important activities under
the purview of management. From this broader perspective, some of the $b_i$ can be
increased in a revised model, but only if a sufficiently strong case can be made to
management that this revision would be beneficial.

Consequently, information on the economic contribution of the resources to the
measure of performance (Z) for the current study often would be extremely useful.
The simplex method provides this information in the form of “shadow prices” for
the respective resources.


The shadow price for resource i (denoted by $y_i^*$) measures the marginal value of this
resource, that is, the rate at which Z could be increased by (slightly) increasing the
amount of this resource ($b_i$) being made available.¹ The simplex method identifies this
shadow price by $y_i^*$ = coefficient of the ith slack variable in row 0 of the final
simplex tableau.


To illustrate, for the Wyndor Glass Co. problem, the final tableau in Table 4.8
yields


    $y_1^* = 0$ = shadow price for resource 1,
    $y_2^* = \tfrac{3}{2}$ = shadow price for resource 2,
    $y_3^* = 1$ = shadow price for resource 3,


where these resources are the available production capacities of Plants 1, 2, and 3,
respectively ($b_1 = 4$, $b_2 = 12$, and $b_3 = 18$). You can verify that these numbers are
correct by checking in Figs. 3.2 and 3.3 that individually increasing each $b_i$ by 1
indeed would increase the optimal value of Z by $y_i^*$. For example, Fig. 4.3
demonstrates this increase for resource 2 by reapplying the graphical procedure presented in
Sec. 3.2. The optimal solution, (2, 6) with Z = 36, changes to $(\tfrac{5}{3}, \tfrac{13}{2})$ with
$Z = 37\tfrac{1}{2}$ when $b_2$ is increased by 1 (from 12 to 13), so that


    $y_2^* = \Delta Z = 37\tfrac{1}{2} - 36 = \tfrac{3}{2}$.


Figure 4.3 Illustration of shadow price for resource 2 for Wyndor Glass Co. problem.


1 The increase in $b_i$ must be sufficiently small that the current set of basic variables remains optimal since
this rate (marginal value) changes if the set of basic variables changes.


Figure 4.3 demonstrates that $y_2^* = \tfrac{3}{2}$ is the rate at which Z could be increased
by increasing $b_2$ slightly. However, it also demonstrates the common phenomenon
that this interpretation holds only for a small increase in $b_2$. Once $b_2$ is increased
beyond 18, the optimal solution stays at (0, 9) with no further increase in Z. (At that
point, the set of basic variables in the optimal solution has changed, so a new final
simplex tableau would be obtained with new shadow prices, including $y_2^* = 0$.)
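
The marginal-value interpretation can be checked numerically by re-solving the Wyndor Glass Co. problem for different right-hand sides. The sketch below does not use the simplex method; as a stand-in it simply evaluates every corner point of a two-variable problem, which is adequate only for tiny bounded examples. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def solve_two_variable_lp(c, constraints):
    """Maximize c[0]*x1 + c[1]*x2 subject to rows a1*x1 + a2*x2 <= b and x1, x2 >= 0.

    Brute force: intersect every pair of constraint boundaries, keep the feasible
    intersections, and return the best (Z, point). Assumes a bounded feasible region.
    """
    rows = [(Fraction(a1), Fraction(a2), Fraction(b)) for a1, a2, b in constraints]
    # Nonnegativity written in the same <= form: -x1 <= 0 and -x2 <= 0.
    rows += [(Fraction(-1), Fraction(0), Fraction(0)),
             (Fraction(0), Fraction(-1), Fraction(0))]
    best_value, best_point = None, None
    for (a1, a2, b), (d1, d2, e) in combinations(rows, 2):
        det = a1 * d2 - a2 * d1
        if det == 0:
            continue  # parallel boundaries have no single intersection point
        x1 = (b * d2 - a2 * e) / det  # Cramer's rule
        x2 = (a1 * e - b * d1) / det
        if all(r1 * x1 + r2 * x2 <= rb for r1, r2, rb in rows):
            value = c[0] * x1 + c[1] * x2
            if best_value is None or value > best_value:
                best_value, best_point = value, (x1, x2)
    return best_value, best_point

def wyndor_optimal_z(b1=4, b2=12, b3=18):
    """Optimal Z for the Wyndor model with the given resource amounts."""
    constraints = [(1, 0, b1), (0, 2, b2), (3, 2, b3)]
    return solve_two_variable_lp((3, 5), constraints)[0]

base = wyndor_optimal_z()
print(wyndor_optimal_z(b2=13) - base)                   # increase per unit of resource 2
print(wyndor_optimal_z(b2=19) - wyndor_optimal_z(b2=18))  # no gain once b2 reaches 18
```

Increasing b2 from 12 to 13 raises Z by 3/2, while increasing it from 18 to 19 raises Z by nothing, in line with the limited range over which a shadow price remains valid.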

Now note in Fig. 4.3 why $y_1^* = 0$. Because the constraint on resource 1,
$x_1 \le 4$, is not binding on the optimal solution, (2, 6), there is a surplus of this resource.
Therefore, increasing $b_1$ beyond 4 cannot yield a new optimal solution with a larger
value of Z.

By contrast, the constraints on resources 2 and 3, $2x_2 \le 12$ and $3x_1 + 2x_2 \le 18$,
are binding constraints (constraints that hold with equality at the optimal
solution). Because the limited supply of these resources ($b_2 = 12$, $b_3 = 18$) binds Z from
being increased further, they have positive shadow prices. Economists refer to such
resources as scarce goods, whereas resources available in surplus (such as resource
1) are free goods (zero shadow price).
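
The distinction between binding and nonbinding constraints can be verified directly by computing the slack of each constraint at the optimal solution. The short sketch below does this for the Wyndor Glass Co. data; the function name and the output labels are illustrative only.

```python
WYNDOR_CONSTRAINTS = [(1, 0, 4), (0, 2, 12), (3, 2, 18)]  # rows (a1, a2, b)

def classify_resources(point, constraints):
    """Return (slack, label) for each constraint a1*x1 + a2*x2 <= b at the given point."""
    x1, x2 = point
    result = []
    for a1, a2, b in constraints:
        slack = b - (a1 * x1 + a2 * x2)  # unused amount of the resource
        result.append((slack, "binding" if slack == 0 else "not binding"))
    return result

for index, (slack, label) in enumerate(classify_resources((2, 6), WYNDOR_CONSTRAINTS), start=1):
    print(f"resource {index}: slack = {slack}, {label}")
```

At (2, 6) resource 1 has a slack of 2 (a free good), whereas resources 2 and 3 have zero slack (scarce goods).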

The kind of information provided by shadow prices clearly is valuable to
management when it considers reallocations of resources within the organization. It also
is very helpful when an increase in $b_i$ can be achieved only by going outside the
organization to purchase more of the resource in the marketplace. For example,
suppose that Z represents profit and the unit profits of the activities (the $c_j$) include the
costs (at regular prices) of all the resources consumed. Then a positive shadow price
of $y_i^*$ for resource i means that the total profit Z can be increased by $y_i^*$ by purchasing
one more unit of this resource at its regular price. Alternatively, if a premium price
must be paid for the resource in the marketplace, then $y_i^*$ represents the maximum
premium (excess over the regular price) that would be worth paying.

In the Wyndor Glass Co. problem, management has ruled out any expansion of
the production capacity in the three plants at this time. Nevertheless, the available
capacities allocated to the two new products can be increased by cutting back further
on the current product line. The OR Department actually investigates this possibility
as part of the sensitivity analysis study in Sec. 6.7.

The theoretical foundation for shadow prices is provided by the duality theory 
described in Chap. 6. 


Sensitivity Analysis 


When discussing the certainty assumption for linear programming at the end of Sec.
3.3, we pointed out that the values used for the model parameters (the $a_{ij}$, $b_i$, and $c_j$
identified in Table 3.3) generally are just estimates of quantities whose true values
will not become known until the linear programming study is implemented at some
time in the future. The general purpose of sensitivity analysis is to identify the
sensitive parameters (i.e., those that cannot be changed without changing the optimal
solution), to try to estimate these parameters more closely, and then to select a solution
that remains a good one over the range of likely values of the sensitive parameters.

How are the sensitive parameters identified? In the case of the $b_i$, you have just
seen that this information is given by the shadow prices provided by the simplex
method. In particular, if $y_i^* > 0$, then the optimal solution changes if $b_i$ is changed,
so $b_i$ is a sensitive parameter. However, $y_i^* = 0$ implies that the optimal solution is
not sensitive to at least small changes in $b_i$. Consequently, if the value used for $b_i$ is
an estimate of the amount of the resource that will be available (rather than a
managerial decision), then the $b_i$ that need to be estimated more closely are those with
positive shadow prices, especially those with large shadow prices.

When there are just two variables, the sensitivity of the various parameters can
be analyzed graphically. For example, in Fig. 4.3 (or Fig. 3.3), $c_1 = 3$ can be changed
to any other value within the range from 0 to $7\tfrac{1}{2}$ without the optimal solution changing
from (2, 6). (The reason is that any value of $c_1$ within this range keeps the slope of
$Z = c_1 x_1 + 5x_2$ between the slopes of the $2x_2 = 12$ and $3x_1 + 2x_2 = 18$ lines.)
Similarly, if $c_2 = 5$ is the only parameter changed, it can have any value greater
than 2 without affecting the optimal solution. Hence, neither $c_1$ nor $c_2$ is a sensitive
parameter.
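
The ranges quoted above for the objective function coefficients can be confirmed numerically. The sketch below assumes the five corner-point feasible solutions of the Wyndor Glass Co. problem, (0, 0), (4, 0), (4, 3), (2, 6), and (0, 6), and checks whether (2, 6) is still among the best of them for a trial pair of coefficients. The function names are hypothetical, and this enumeration is a classroom device rather than the procedure of Sec. 6.6.

```python
from fractions import Fraction

# Corner-point feasible solutions of the Wyndor Glass Co. problem.
CORNER_POINTS = [(0, 0), (4, 0), (4, 3), (2, 6), (0, 6)]

def optimal_corner_points(c1, c2):
    """Return every corner point that maximizes c1*x1 + c2*x2 (ties included)."""
    values = {point: c1 * point[0] + c2 * point[1] for point in CORNER_POINTS}
    best = max(values.values())
    return [point for point in CORNER_POINTS if values[point] == best]

def stays_optimal(c1, c2, point=(2, 6)):
    """True if the given point remains optimal for these coefficients."""
    return point in optimal_corner_points(c1, c2)

print(stays_optimal(Fraction(15, 2), 5))  # c1 at the top of its range: (2, 6) ties with (4, 3)
print(stays_optimal(8, 5))                # c1 beyond the range: (4, 3) takes over
print(stays_optimal(3, 2))                # c2 at its lower limit: still optimal (tie)
```

At the endpoints of a range the point (2, 6) ties with an adjacent corner point, so it remains optimal but is no longer the unique optimal solution.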

The easiest way to analyze the sensitivity of each of the $a_{ij}$ parameters
graphically is to check if the corresponding constraint is binding on the optimal solution.
Because $x_1 \le 4$ is not a binding constraint, any sufficiently small change in its
coefficients ($a_{11} = 1$, $a_{12} = 0$) is not going to change the optimal solution, so these
are not sensitive parameters. On the other hand, both $2x_2 \le 12$ and $3x_1 + 2x_2 \le 18$
are binding constraints, so changing any one of their coefficients ($a_{21} = 0$, $a_{22} = 2$,
$a_{31} = 3$, $a_{32} = 2$) is going to change the optimal solution, and therefore these are
sensitive parameters.

Typically, greater attention is given to performing sensitivity analysis on the $b_i$
and $c_j$ parameters than on the $a_{ij}$ parameters. On real problems with hundreds or
thousands of constraints and variables, the effect of changing one $a_{ij}$ is usually
negligible, but changing one $b_i$ or $c_j$ can have real impact. Furthermore, in many cases,
the $a_{ij}$ values are determined by the technology being used (the $a_{ij}$ are sometimes
called technological coefficients), so there may be relatively little (or no) uncertainty
about their final values. This is fortunate, because there are far more $a_{ij}$ parameters
than $b_i$ and $c_j$ parameters for large problems.

For problems with more than two variables, you cannot analyze the sensitivity
of the parameters graphically as was just done for the Wyndor Glass Co. problem.
However, you can extract the same kind of information from the simplex method.
Getting this information requires using the fundamental insight described in Sec. 5.3
to deduce the changes that get carried along to the final simplex tableau as a result of
changing the value of a parameter in the original model. The rest of the procedure is
described in Sec. 6.6.


Parametric Linear Programming 


Sensitivity analysis involves changing one parameter at a time in the original model
to check its effect on the optimal solution. By contrast, parametric linear
programming (or parametric programming for short) involves the systematic study of how
the optimal solution changes as many of the parameters change simultaneously over
some range. This study can provide a very useful extension of sensitivity analysis,
e.g., to check the effect of “correlated” parameters that change together due to
exogenous factors such as the state of the economy. However, a more important
application is the investigation of trade-offs in parameter values. For example, if the
$c_j$ represent the unit profits of the respective activities, it may be possible to increase
some of the $c_j$ at the expense of decreasing others by an appropriate shifting of
personnel and equipment among activities. Similarly, if the $b_i$ represent the amounts
of the respective resources being made available, it may be possible to increase some
of the $b_i$ by agreeing to accept decreases in some of the others.
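
To make the idea of a trade-off among the $c_j$ concrete, the sketch below sweeps a single parameter theta through the Wyndor Glass Co. problem under a purely hypothetical trade-off, $c_1 = 3 + \theta$ and $c_2 = 5 - \theta$, and records the optimal value of Z and the optimal corner point for each theta. The trade-off rule, the function names, and the use of corner-point enumeration are all assumptions made for illustration; the actual algorithmic technique is the one described in Sec. 9.3.

```python
from fractions import Fraction

# Corner-point feasible solutions of the Wyndor Glass Co. problem.
CORNER_POINTS = [(0, 0), (4, 0), (4, 3), (2, 6), (0, 6)]

def optimum_for_theta(theta):
    """Best (Z, corner point) when c1 = 3 + theta and c2 = 5 - theta (assumed trade-off)."""
    c1, c2 = 3 + theta, 5 - theta
    # max() compares Z first; on a tie it falls back to comparing the points themselves.
    return max((c1 * x1 + c2 * x2, (x1, x2)) for x1, x2 in CORNER_POINTS)

def sweep(thetas):
    """Tabulate (theta, optimal Z, optimal corner point) over the given values."""
    return [(theta,) + optimum_for_theta(theta) for theta in thetas]

for theta, z, point in sweep([0, 1, Fraction(9, 5), 2, 3]):
    print(f"theta = {theta}: Z* = {z} at {point}")
```

Under this assumed trade-off the optimal solution stays at (2, 6) until theta reaches 9/5, where (4, 3) becomes equally good and then takes over, so the optimal Z is a piecewise linear function of theta.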

In some applications, the main purpose of the study is to determine the most
appropriate trade-off between two basic factors, such as costs and benefits. The usual
approach is to express one of these factors in the objective function (e.g., minimize
total cost) and incorporate the other into the constraints (e.g., benefits $\ge$ minimum
acceptable level), as was done for the Nori & Leets Co. air pollution problem in Sec.
3.4. Parametric linear programming then enables systematic investigation of what
happens when the initial tentative decision on the trade-off (e.g., the minimum
acceptable level for the benefits) is changed by improving one factor at the expense of
the other. This approach is illustrated by the case study in Sec. 8.5, where the two
basic factors are the distance traveled by high school students and the degree of racial
balance achieved in their schools.

The algorithmic technique for parametric linear programming is a natural ex- 
tension of that for sensitivity analysis, so it too is based on the simplex method. The 
procedure is described in Sec. 9.3. 


4.8 Computer Implementation 


Computer codes for the simplex method now are widely available for essentially all
modern computer systems. In fact, major computer manufacturers usually supply their
customers with a rather sophisticated linear programming software package
(Mathematical Programming System) that also includes many of the special procedures
described in Chaps. 6, 7, and 9 (including the algorithmic techniques introduced in the
preceding section). Other very good linear programming systems also have been
developed by independent software development companies and service bureaus, and
further progress continues to be made.

These production computer codes do not closely follow either the algebraic 
form or the tabular form of the simplex method presented in Secs. 4.3 and 4.4. These 
forms can be streamlined considerably for computer implementation. Therefore, the 
codes use instead a matrix form (usually called the revised simplex method) that is 
especially well suited for the computer. This form accomplishes exactly the same 
things as the algebraic or tabular form, but it does this while computing and storing 
only the numbers that are actually needed for the current iteration, and then it carries 
along the essential data in a more compact form. The revised simplex method is 
described in Sec. 5.2. 

The available software packages are used routinely to solve surprisingly large
linear programming problems. For example, a problem with 5,000 functional
constraints and 10,000 variables usually can be solved in less than an hour on a mainframe
computer of recent vintage.¹ Problems with several times this number of constraints
and variables also have been successfully solved by the general simplex method. If
the problem has some kind of special structure (as described in Chap. 7) that can be
solved by a streamlined version of the simplex method, then even much larger sizes
can sometimes be handled. For example, a problem with 100,000 functional
constraints and 500,000 variables has been solved when all but about 1,000 of these
constraints were of a special kind (generalized upper bound constraints discussed at
the end of Sec. 7.5). Even this one is far from the largest when highly specialized
types of linear programming problems (typically network flow problems) are
considered.

Several factors affect how long it will take the general simplex method to solve
a linear programming problem. The most important one is the number of ordinary
functional constraints. In fact, computation time tends to be roughly proportional to
the cube of this number, so that doubling this number may multiply the computation
time by a factor of approximately 8. By contrast, the number of variables is a relatively
minor factor.² Thus doubling the number of variables probably will not even double
the computation time. A third factor of some importance is the density of the table of
constraint coefficients (i.e., the proportion of the coefficients that are not zero),
because this affects the computation time per iteration. One common rule of thumb for
the number of iterations is that it tends to be roughly twice the number of functional
constraints.
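
These rules of thumb are easy to turn into back-of-the-envelope arithmetic. The helper functions below are hypothetical names introduced only for illustration, and the results are rough planning figures rather than predictions for any particular linear programming system.

```python
def relative_time(constraint_ratio):
    """Approximate growth factor in run time when the number of functional
    constraints is multiplied by constraint_ratio (cubic rule of thumb)."""
    return constraint_ratio ** 3

def estimated_iterations(num_constraints):
    """Rule of thumb: roughly twice the number of functional constraints."""
    return 2 * num_constraints

def density(num_nonzero, num_constraints, num_variables):
    """Proportion of the constraint coefficients that are not zero."""
    return num_nonzero / (num_constraints * num_variables)

print(relative_time(2))            # doubling the constraints
print(estimated_iterations(5000))  # a problem with 5,000 functional constraints
print(density(500, 10, 100))       # 500 nonzeros in a 10 x 100 coefficient table
```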

1 On problems of this size, the computation time depends greatly upon the linear programming system
being used because large savings can be achieved by using special techniques (e.g., crashing techniques
for quickly finding an advanced initial basic feasible solution). When problems are re-solved periodically
after minor updating of the data, much time often is saved by using (or modifying) the last optimal solution
to provide the initial basic solution for the new run.

2 This statement assumes that the revised simplex method described in Sec. 5.2 is being used.


One difficulty in dealing with large linear programming problems is the
tremendous amount of data involved. For example, a problem with just 1,000 functional
constraints and variables would have 1 million constraint coefficients to be specified!
Therefore, most experienced practitioners make extensive use of the computer for
data-processing purposes both before and after applying the simplex method.
Frequently, a matrix generator program will be written to convert the basic raw data
into constraint coefficients in an appropriate format for the simplex method. The matrix
generator will do the arithmetic required in this conversion, repeat constraints of a
recurring type, and fill in the zero coefficients (most of the coefficients usually are
zeroes in large problems). It also should print out the key input data in an easily
readable form so that they can be shown to various people for checking and correcting.
Another useful function of a matrix generator is to scale the coefficients (by changing
the units for the activities or resources) to approximately the same order of magnitude
to avoid significant round-off error.
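
A toy version of a matrix generator can be sketched as follows. Everything here is an assumption made for illustration: the record format (constraint name, variable name, coefficient), the function names, and the simple column scaling are hypothetical and do not describe any particular commercial package. The sketch stores only the nonzero coefficients, fills in the zeroes on request, and optionally rescales the coefficients of a variable to reflect a change in its units.

```python
def generate_matrix(raw_records, scale=None):
    """Convert (constraint, variable, coefficient) records into sparse rows.

    Only nonzero coefficients are stored; missing entries are implicit zeroes.
    scale maps a variable name to a unit-change factor (default factor is 1).
    """
    scale = scale or {}
    matrix = {}
    for constraint, variable, coefficient in raw_records:
        value = coefficient * scale.get(variable, 1)
        if value != 0:
            matrix.setdefault(constraint, {})[variable] = value
    return matrix

def dense_row(matrix, constraint, variables):
    """Fill in the zero coefficients for one constraint, e.g. for printing and checking."""
    row = matrix.get(constraint, {})
    return [row.get(variable, 0) for variable in variables]

# Raw data for the Wyndor Glass Co. constraints (names are illustrative).
RAW = [("plant1", "x1", 1), ("plant2", "x2", 2), ("plant3", "x1", 3), ("plant3", "x2", 2)]

matrix = generate_matrix(RAW)
for name in ("plant1", "plant2", "plant3"):
    print(name, dense_row(matrix, name, ["x1", "x2"]))
```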

For many of the same reasons, it often is helpful to write an output analyzer 
program to convert the output of the simplex method into a useful form. An output 
analyzer (or report writer) has three major functions. Two of these are to compile 
and summarize relevant information for two of the post-optimality analysis tasks 
introduced in the preceding section, namely, debugging the model and sensitivity 
analysis. The third major purpose is to develop a well-organized report presenting the 
relevant information about the proposed solution in the vernacular of management. 


4.9 New Developments 


Two crucial events have been primarily responsible for the tremendous impact of
linear programming in recent decades. One was the invention of the remarkably
efficient simplex method in 1947 by George Dantzig.¹ The second crucial event was
the computer revolution that makes it possible for the simplex method to solve such
huge problems.


A Powerful New Algorithm 


Now there has been a dramatic new development that promises to give further impetus
to the impact of linear programming. In 1984, Narendra Karmarkar of AT&T Bell
Laboratories published a landmark paper² announcing a new algorithm for solving
huge linear programming problems. In contrast to the simplex method’s approach of
focusing on corner-point feasible solutions on the boundary of the feasible region,
Karmarkar’s algorithm is an interior-point algorithm that cuts through the interior
of the feasible region to reach an optimal solution.

The initial claims were that this new algorithm can solve huge linear program- 
ming problems up to 50 times as fast as the simplex method. Not surprisingly, this 
dramatic announcement became front-page news in The New York Times, etc. How- 
ever, because of proprietary considerations, no code was made available, and only 
sketchy details about the implementation of the algorithm were released, so it was not 
possible for others to check these claims. For the next four years, few additional 
details were forthcoming. Meanwhile, independent investigators were attempting to 
develop sophisticated computer implementations of the algorithm and reporting a 
variety of mixed results about comparisons with the simplex method. The entire 
operations research community was abuzz with excitement and controversy! 

Then, in 1988, came another dramatic announcement. AT&T Bell Laboratories
were releasing a powerful computer implementation of variants of Karmarkar’s
algorithm for commercial distribution. Called the AT&T KORBX Linear Programming
System, it is a large-scale implementation of the algorithm on a parallel-vector
minisupercomputer (the Alliant FX8 supercomputer). Each installation of the entire system
(including the dedicated computer) was available initially for approximately
$8,900,000.


1 Since 1966, Professor Dantzig has been a colleague of the authors in the Department of Operations
Research at Stanford University, and has continued to lead in the development of linear programming.

2 Karmarkar, Narendra: “A New Polynomial-Time Algorithm for Linear Programming,” Combinatorica,
4: 373-395, 1984.

Coincident with this announcement, AT&T Bell Laboratories released
substantial information on their computational experience with their system, including
comparisons with a standard computer implementation of the simplex method (not
necessarily the most efficient available for each problem run) called MINOS.¹ They
reported successfully solving some very large problems with many thousand, or even
many tens of thousands of functional constraints, including some too large for
implementation by the simplex method. For some highly specialized problems, the sizes
are even larger. For some of the larger problems, improvements in running time over
the version of the simplex method used were great; factors of 10 to 50 were common.

So how does Karmarkar’s interior-point algorithm compare in efficiency with 
the simplex method? This is a complicated question, because it requires a comparison 
for many different sizes and classes of linear programming problems. No definitive 
answer can be given at this time. We still need a disinterested party (or two competing 
parties) to conduct a comprehensive program of comparative testing with the most 
sophisticated computer implementations available for the two algorithms. Until this 
happens, the jury will remain out. 

This much can be said now. Both algorithms are here to stay. We anticipate 
that they will play vital complementary roles in linear programming throughout your 
career. To define these roles, we need to point out a key advantage of each algorithm. 
The simplex method is ideally suited for post-optimality analysis as described in Sec. 
4.7, whereas the interior-point algorithm cannot perform this analysis efficiently (ex- 
cept for obtaining shadow prices). On the other hand, the key advantage of the interior- 
point algorithm is that its rate of growth of computation time as the problem size 
grows frequently is somewhat less than for the simplex method. According to current 
evidence, the high setup time of the interior-point algorithm prevents it from being 
strongly competitive with the simplex method for relatively small problems (tens or 
perhaps hundreds of functional constraints). However, in at least some cases, it tends 
to gain ever-increasing superiority on larger and larger problems (thousands or tens 
of thousands of functional constraints). Experience may differ with different types of 
linear programming problems. 

One other meaningful way of comparing the two algorithms is to examine their
theoretical properties regarding computational complexity. Karmarkar has proven that
the original version of his algorithm is a polynomial time algorithm; i.e., the time
required to solve any linear programming problem can be bounded above by a
polynomial function of the size of the problem. Pathological counterexamples have been
constructed to demonstrate that the simplex method does not possess this property, so
it is an exponential time algorithm (i.e., the required time can only be bounded
above by an exponential function of problem size). This difference in worst-case
performance is noteworthy. However, it tells us nothing about their comparison in
average performance on real problems, which is the more crucial issue.


1 MINOS was developed largely in the Systems Optimization Laboratory of the Department of Operations
Research of Stanford University. Perhaps better known as an optimizer for nonlinear programming, it is a
state-of-the-art FORTRAN-based system that is widely used throughout the world. The performance of
MINOS is controlled by a number of system parameters or “options” that enable fine-tuning for the
characteristics of the particular problem being run. However, each option has a default value that should
be appropriate for most problems, and these frequently were the values used by AT&T Bell Laboratories.
We have been informed that some of the default values are different in a later version of MINOS to be
more suitable for particularly challenging problems.

Based on all evidence now available, we currently anticipate the following roles
for the two algorithms as we approach and enter the twenty-first century. The simplex
method should continue to be the standard algorithm for the routine use of linear
programming. However, Karmarkar’s interior-point algorithm (or some of its variants
and refinements) should gradually gain widespread use by heavy-duty users of linear
programming dealing with relatively large problems. This algorithm converges toward
an optimal solution without ever literally reaching it, so the procedure terminates with
an arbitrarily close approximation of the desired solution. Therefore, when an
interior-point algorithm is used to closely approximate an optimal solution, the solution
obtained probably will be converted to the nearest basic feasible solution to serve as the
initial solution for the simplex method to finish solving to optimality and then to
conduct post-optimality analysis. The two algorithms thereby would become a package
for dealing with some large problems, whereas the simplex method would be used by
itself for routine cases.

Consequently, we believe that you probably will be involved with frequent use
of the simplex method during your career, whereas you are less likely to use the
interior-point algorithm as well. If you do use both algorithms, the computer system
containing the interior-point algorithm will be a “black box” for quickly generating
an optimal solution for the current model, so there is little need to know much about
this algorithm. The “hands-on” work then comes with applying the simplex method
for post-optimality analysis, so it is important that you be familiar with this algorithm.
Consequently, our algorithmic focus in this part of the book is on the simplex method
alone. We then give an elementary introduction to the nature of interior-point
algorithms in Sec. 9.4, while foregoing mathematical details that are beyond the level of
this book.


Improved Implementations of the Simplex Method 


One of the by-products of the development of Karmarkar’s interior-point algorithm 
has been a major renewal of efforts to improve the efficiency of computer implemen- 
tations of the simplex method. This effort has been led by companies with a strong 
commercial interest in the simplex method. For example, IBM distributes MPSX/370, 
a widely used system based on the simplex method for use on IBM mainframes. 

At the IBM Thomas J. Watson Research Center in Yorktown Heights, New 
York, a new experimental code, YKTLP, is being developed for the implementation 
of the simplex method on IBM mainframes, particularly the model 390 with Vector 
Facility. One feature of this code is the use of vector hardware to simultaneously 
compute the new coefficients of nonbasic variables in row 0 of the simplex tableau 
for the current iteration. When there are thousands of nonbasic variables, the benefit 
from such vector processing can be very substantial. Other opportunities to exploit 
vector processing to speed up each iteration also are being explored. This experimental 
code is already showing big improvements—factors of 10 to 50—over the standard 
MINOS implementation of the simplex method (the one used for the comparative 
testing by AT&T Bell Laboratories). 


Linear Programming on Microcomputers 


This is an exciting time to be introduced to linear programming for still another reason. 
We are now seeing an explosion in the capability of doing linear programming on 
microcomputers. You now can solve problems on a personal computer that only 
recently required the use of a mainframe computer, with all its inconveniences and 
expense. 

There are now dozens of companies marketing linear programming software for 
microcomputers based on the simplex method. Many of these companies are located 
in the United States, but a considerable number are scattered around the world. The 
early emphasis was on educational software, but now many of the packages are 
suitable for commercial applications on problems of rather substantial size. 

For example, the popular package LINDO can handle problems with under 2,000 
functional constraints and 4,000 variables, provided the number of nonzero coeffi- 
cients in the functional constraints does not exceed 32,000. Solving large problems 
usually requires additional memory, and the larger versions of some programs (such 
as LINDO) require a math coprocessor. 

We pointed out in Sec. 4.8 that dealing with problems of such size normally 
requires making extensive use of the computer for data-processing purposes in con- 
structing the model, etc. Mathematical modeling languages have now been developed 
for personal computers. For example, the GAMS/MINOS package is a combination 
of two well-known mainframe programs now available for IBM personal computers 
that offers a powerful algebraic modeling language for generating the constraints of 
the model automatically. (Yes, this is the same MINOS that is being used for com- 
parative testing by the research laboratories of AT&T and IBM.) The English package 
XPRESS-LP offers a similar capability. MPL (Mathematical Programming Language) 
is a modeling system from Maximal Software in Iceland that is used as a front end 
for other linear programming packages. 

The convenient data entry and editing features of spreadsheets also are very 
helpful in constructing linear programming models. Many of the current packages are 
spreadsheet-compatible, and several (e.g., VINO, What’s Best?, and XA) actually 
perform the optimization within the spreadsheet program. 

Some of the linear programming packages include extensions to other areas of 
mathematical programming. For example, LINDO includes integer programming 
(Chap. 13) and GAMS/MINOS includes nonlinear programming (Chap. 14). 

Nearly all of the linear programming packages developed in the late 1980s were 
for IBM personal computers and IBM-compatibles. However, LINDO and Turbo- 
Simplex (from Maximal Software) also are available for the Macintosh computer. 
Macintosh-type computers have an architecture better suited for the graphics-based 
nature of many linear programming applications (including those involving networks), 
and we anticipate many more packages becoming available for the Macintosh in the 
future. 

Your OR COURSEWARE (available for the first time in this edition) introduces 
you to the use of microcomputers for linear programming and other areas of operations 
research. However, this tutorial software is designed strictly for educational purposes 
while you deal with the tiny homework problems in this book. Later, when you are 
dealing with “real” problems, you will want to use a more powerful software package. 

The above information about linear programming on microcomputers is very 
much up to date at this writing, but we need to point out its almost instant obsolescence. 


4.10 Conclusions 


The simplex method is an efficient and reliable algorithm for solving linear program- 
ming problems. It also provides the basis for performing the various parts of post- 
optimality analysis very efficiently. 

Although it has a useful geometric interpretation, the simplex method is an 
algebraic procedure. At each iteration, it moves from the current basic feasible solution 
to a better adjacent basic feasible solution by choosing both an entering basic variable 
and a leaving basic variable, and then using Gaussian elimination to solve a system 
of linear equations. When the current solution has no adjacent basic feasible solution 
that is better, the current solution is optimal and the algorithm stops. 
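
The iteration just described can be sketched in a few lines of code. The following is a minimal illustration, not a production routine: the function name `simplex_max` is hypothetical, it assumes a maximization problem whose functional constraints are all of the ≤ type with nonnegative right-hand sides (so the slack variables give the initial basic feasible solution), and it has no safeguard against cycling under degeneracy. Exact fractions are used so that the tableau entries match hand calculations.

```python
from fractions import Fraction

def simplex_max(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, assuming every b[i] >= 0."""
    m, n = len(A), len(c)
    # Rows 1..m of the tableau: [A | I | b]; the slacks form the initial basis.
    rows = [[Fraction(v) for v in A[i]]
            + [Fraction(int(i == k)) for k in range(m)]
            + [Fraction(b[i])] for i in range(m)]
    # Row 0 holds Z - c.x = 0, so a negative entry marks an improving variable.
    z = [Fraction(-v) for v in c] + [Fraction(0)] * (m + 1)
    basis = list(range(n, n + m))
    while True:
        # Entering basic variable: most negative coefficient in row 0.
        col = min(range(n + m), key=lambda j: z[j])
        if z[col] >= 0:
            break  # no better adjacent basic feasible solution: optimal
        # Leaving basic variable: minimum ratio test over positive entries.
        ratios = [(rows[i][-1] / rows[i][col], i)
                  for i in range(m) if rows[i][col] > 0]
        if not ratios:
            raise ValueError("Z is unbounded")
        _, r = min(ratios)
        # Gaussian elimination: turn the pivot column into a unit vector.
        pivot = rows[r][col]
        rows[r] = [v / pivot for v in rows[r]]
        for i in range(m):
            if i != r and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [v - f * p for v, p in zip(rows[i], rows[r])]
        f = z[col]
        z = [v - f * p for v, p in zip(z, rows[r])]
        basis[r] = col
    x = [Fraction(0)] * (n + m)
    for i, j in enumerate(basis):
        x[j] = rows[i][-1]
    return x[:n], z[-1]

# Wyndor Glass Co. model (objective 3x1 + 5x2 as formulated in Chap. 3).
solution, value = simplex_max([3, 5], [[1, 0], [0, 2], [3, 2]], [4, 12, 18])
```

On the Wyndor Glass Co. data this sketch visits (0, 0), (0, 6), and (2, 6) in turn, the same path described for that example.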

We presented the full algebraic form of the simplex method to convey its logic, 
and then we streamlined the method to a more convenient tabular form. To set up for 
starting the simplex method, it is sometimes necessary to use artificial variables to 
obtain an initial basic feasible solution for a revised problem. If so, either the Big M 
method or the two-phase method is used to ensure that the simplex method obtains 
an optimal solution for the original problem. 

Microcomputer software packages based on the simplex method now are widely 
available for dealing with problems of modest size. Mainframe programs are routinely 
used to solve and analyze problems with many hundreds or even thousands of func- 
tional constraints and variables. 

Karmarkar’s interior-point algorithm provides a powerful new tool for solving 
very large problems. 
PROBLEMS 


(For all of the following problems, when you are asked to solve by the simplex method you 
can use the interactive routine in your OR COURSEWARE and then print out your work.) 


1. Consider the linear programming model formulated for Prob. 1 of Chap. 3. 
(a) Identify all the corner-point feasible solutions for this model. 


(b) Solve by the simplex method in algebraic form. 

(c) Solve by the simplex method in tabular form. 

(d) Use the graphical solution to perform sensitivity analysis on this model; i.e., identify 
the sensitive parameters that cannot be changed without changing the optimal so- 
lution. 

2. Consider the following problem. 

Maximize    $Z = x_1 + 2x_2$, 
subject to  $x_1 + 3x_2 \le 8$   (resource 1) 
            $x_1 + x_2 \le 4$    (resource 2) 
and         $x_1 \ge 0$, $x_2 \ge 0$. 


(a) Solve this problem graphically. Identify all the corner-point feasible solutions for 
this model. 

(b) Solve by the simplex method in algebraic form. 

(c) Solve by the simplex method in tabular form. 

(d) Identify the shadow prices for the resources from the final tableau for the simplex 
method. Demonstrate graphically that these shadow prices are the correct ones. 

(e) Use the graphical solution to perform sensitivity analysis on this model; i.e., identify 
the sensitive parameters that cannot be changed without changing the optimal 
solution. 

3. Follow the instructions of Prob. 2 for the following problem. 

Maximize    $Z = 3x_1 + 2x_2$, 
subject to  $x_1 \le 4$           (resource 1) 
            $x_1 + 3x_2 \le 15$   (resource 2) 
            $2x_1 + x_2 \le 10$   (resource 3) 
and         $x_1 \ge 0$, $x_2 \ge 0$. 


4.* Consider the following problem. 
Maximize    $Z = 4x_1 + 3x_2 + 6x_3$, 
subject to  $3x_1 + x_2 + 3x_3 \le 30$ 
            $2x_1 + 2x_2 + 3x_3 \le 40$ 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 
(a) Solve by the simplex method in algebraic form. 
(b) Solve by the simplex method in tabular form. 
5. Use the simplex method to solve the following problem. 
Maximize    $Z = 2x_1 - x_2 + x_3$, 
subject to  $3x_1 + x_2 + x_3 \le 6$ 
            $x_1 - x_2 + 2x_3 \le 1$ 
            $x_1 + x_2 - x_3 \le 2$ 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 
6. Use the simplex method to solve the following problem. 
Maximize    $Z = -x_1 + x_2 + 2x_3$, 
subject to  $x_1 + 2x_2 - x_3 \le 20$ 
            $-2x_1 + 4x_2 + 2x_3 \le 60$ 
            $2x_1 + 3x_2 + x_3 \le 50$ 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 
7. Consider the following problem. 
Maximize    $Z = 2x_1 - 2x_2 + 3x_3$, 
subject to  $-x_1 + x_2 + x_3 \le 4$     (resource 1) 
            $2x_1 - x_2 + x_3 \le 2$     (resource 2) 
            $x_1 + x_2 + 3x_3 \le 12$    (resource 3) 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 
(a) Solve by the simplex method. 
(b) Identify the shadow prices for the three resources and describe their significance. 
8. Consider the following problem. 
Maximize    $Z = 2x_1 + 4x_2 - x_3$, 
subject to  $3x_2 - x_3 \le 30$             (resource 1) 
            $2x_1 - x_2 + x_3 \le 10$       (resource 2) 
            $4x_1 + 2x_2 - 2x_3 \le 40$     (resource 3) 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 
(a) Solve by the simplex method. 
(b) Identify the shadow prices for the three resources and describe their significance. 
9. Consider the following problem. 
Maximize    $Z = 5x_1 + 4x_2 - x_3 + 3x_4$, 
subject to  $3x_1 + 2x_2 - 3x_3 + x_4 \le 24$    (resource 1) 
            $3x_1 + 3x_2 + x_3 + 3x_4 \le 36$    (resource 2) 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$, $x_4 \ge 0$. 


(a) Solve by the simplex method. 
(b) Identify the shadow prices for the two resources and describe their significance. 


10. Label each of the following statements as True or False, and then justify your answer 
by referring to specific statements (with page citations) in the chapter. 


(a) The simplex method’s rule for choosing the entering basic variable is used because 
it always leads to the best adjacent basic feasible solution (largest Z). 

(b) The simplex method’s rule for choosing the leaving basic variable is used because 
making another choice normally would yield a basic solution that is not feasible. 

(c) When a linear programming model has an equality constraint, an artificial variable 
is introduced into this constraint in order to start the simplex method with an obvious 
initial basic solution that is feasible for the original model. 


11. Consider the following problem. 
Maximize    $Z = 2x_1 + 4x_2 + 3x_3$, 
subject to  $x_1 + 3x_2 + 2x_3 \le 30$ 
            $x_1 + x_2 + x_3 \le 24$ 
            $3x_1 + 5x_2 + 3x_3 \le 60$ 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 


You are given the information that $x_1 > 0$, $x_2 = 0$, and $x_3 > 0$ in the optimal solution. 

(a) Describe how one can use this information in order to adapt the simplex method to 
solve this problem in the minimum possible number of iterations (when starting from 
the usual initial feasible solution). Do not actually perform any iterations. 

(b) Use the procedure developed in part (a) to solve this problem. 


12. Consider the following problem. 
Maximize    $Z = 5x_1 + x_2 + 3x_3 + 4x_4$, 
subject to  $x_1 - 2x_2 + 4x_3 + 3x_4 \le 20$ 
            $-4x_1 + 6x_2 + 5x_3 - 4x_4 \le 40$ 
            $2x_1 - 3x_2 + 3x_3 + 8x_4 \le 50$ 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$, $x_4 \ge 0$. 
Use the simplex method to demonstrate that Z is unbounded. 


13. For this problem, we will use vector notation, $\mathbf{x} = (x_1, x_2, \ldots, x_n)$, to represent 
solutions more compactly. Consider N such solutions, $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(N)}$. A weighted average 
of these N solutions is defined to be a solution $\mathbf{x}$ such that 

$\mathbf{x} = \sum_{k=1}^{N} \alpha_k \mathbf{x}^{(k)}$, 

where the weights $\alpha_1, \alpha_2, \ldots, \alpha_N$ are nonnegative and sum to 1. If the feasible region is 
bounded, then every feasible solution can be expressed as a weighted average of some of the 
corner-point feasible solutions (perhaps in more than one way). Similarly, after solutions are 
augmented with slack variables, every feasible solution can be expressed as a weighted average 
of some of the basic feasible solutions. 

(a) Show that any weighted average of any set of feasible solutions must be a feasible 
solution (so that any weighted average of corner-point feasible solutions must be 
feasible). 

(b) Use the result quoted in part (a) to show that any weighted average of basic feasible 
solutions must be a feasible solution. 


14. Using the facts given in Prob. 13, show that the following statements must be true 
for any linear programming problem that has a bounded feasible region and multiple optimal 
solutions: 

(a) Every weighted average of the optimal basic feasible solutions must be optimal. 

(b) No other feasible solution can be optimal. 


15. Consider the following problem. 
Maximize    $Z = x_1 + x_2 + x_3 + x_4$, 
subject to  $x_1 + x_2 \le 3$ 
            $x_3 + x_4 \le 2$ 
and         $x_j \ge 0$, for j = 1, 2, 3, 4. 

Use the simplex method to find all the optimal basic feasible solutions. 


16.* Consider the following problem. 
Maximize    $Z = 2x_1 + 3x_2$, 
subject to  $x_1 + 2x_2 \le 4$ 
            $x_1 + x_2 = 3$ 
and         $x_1 \ge 0$, $x_2 \ge 0$. 

(a) Solve this problem graphically. 

(b) Using the Big M method, construct the complete first simplex tableau for the simplex 
method and identify the corresponding initial (artificial) basic feasible solution. Also 
identify the initial entering basic variable and the leaving basic variable. 

(c) Solve by the simplex method. 


17. Consider the following problem. 
Minimize    $Z = 3x_1 + 2x_2$, 
subject to  $2x_1 + x_2 \ge 10$ 
            $-3x_1 + 2x_2 \le 6$ 
            $x_1 + x_2 \ge 6$ 
and         $x_1 \ge 0$, $x_2 \ge 0$. 

(a) Solve this problem graphically. 

(b) Using the Big M method, construct the complete first simplex tableau for the simplex 
method and identify the corresponding initial (artificial) basic feasible solution. Also 
identify the initial entering basic variable and the leaving basic variable. 

(c) Solve by the simplex method. 


18. Consider the following problem. 
Maximize    $Z = 2x_1 + 5x_2 + 3x_3$, 
subject to  $x_1 - 2x_2 + x_3 \ge 20$ 
            $2x_1 + 4x_2 + x_3 = 50$ 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 

(a) Using the Big M method, construct the complete first simplex tableau for the simplex 
method and identify the corresponding initial (artificial) basic feasible solution. Also 
identify the initial entering basic variable and the leaving basic variable. 

(b) Solve by the simplex method. 

(c) Using the two-phase method, construct the complete first simplex tableau for Phase 
1 and identify the corresponding initial (artificial) basic feasible solution. Also 
identify the initial entering basic variable and the leaving basic variable. 

(d) Perform Phase 1. 

(e) Construct the complete first simplex tableau for Phase 2. 

(f) Perform Phase 2. 

(g) Compare the sequence of basic feasible solutions obtained in part (b) with that in 
parts (d) and (f). Which of these solutions are artificial basic feasible solutions for 
just a revised problem and which are actually feasible for the real problem? 


19. For the Big M method, explain why the simplex method never would choose an 
artificial variable to be an entering basic variable once all of the artificial variables are nonbasic. 


20. Consider the following problem. 
Minimize    $Z = 2x_1 + x_2 + 3x_3$, 
subject to  $5x_1 + 2x_2 + 7x_3 = 420$ 
            $3x_1 + 2x_2 + 5x_3 \ge 280$ 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 
Using the two-phase method, solve by the simplex method. 


21.* Consider the following problem. 


Maximize    $Z = -x_1 + 4x_2$, 
subject to  $-3x_1 + x_2 \le 6$ 
            $x_1 + 2x_2 \le 4$ 
            $x_2 \ge -3$ 
(no lower bound constraint for $x_1$). 
(a) Solve this problem graphically. 
(b) Reformulate this problem so that it has only two functional constraints and all 
variables have nonnegativity constraints. 
(c) Solve by the simplex method. 


22. Consider the following problem. 


Maximize    $Z = -x_1 + 2x_2 + x_3$, 
subject to  $3x_2 + x_3 \le 120$ 
            $x_1 - x_2 - 4x_3 \le 80$ 
            $-3x_1 + x_2 + 2x_3 \le 100$ 
(no nonnegativity constraints). 
(a) Reformulate this problem so all variables have nonnegativity constraints. 
(b) Solve by the simplex method. 


23. Consider the following problem. 
Minimize    $Z = 3x_1 + 2x_2 + 4x_3$, 
subject to  $2x_1 + x_2 + 3x_3 = 60$ 
            $3x_1 + 3x_2 + 5x_3 \ge 120$ 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 


(a) Using the Big M method, solve by the simplex method. 

(b) Using the two-phase method, solve by the simplex method. 

(c) Compare the sequence of basic feasible solutions obtained in parts (a) and (b). Which 
of these solutions are artificial basic feasible solutions for just a revised problem 
and which are actually feasible for the real problem? 


24. This chapter has described the simplex method as applied to linear programming 
problems where the objective function is to be maximized. Section 4.6 then described how to 
convert a minimization problem into an equivalent maximization problem for applying the 
simplex method. Another option with minimization problems is to make a few modifications 
in the instructions for the simplex method given in the chapter in order to apply the algorithm 
directly. 

(a) Describe what these modifications would need to be. 
(b) Using the Big M method, apply the modified algorithm developed in part (a) to 
solve the following problem directly. 

Minimize    $Z = 3x_1 + 8x_2 + 5x_3$, 
subject to  $3x_2 + 4x_3 \ge 70$ 
            $3x_1 + 5x_2 + 2x_3 \ge 70$ 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 


25. Consider the following problem. 
Maximize    $Z = 4x_1 + 2x_2 + 3x_3 + 5x_4$, 
subject to  $2x_1 + 3x_2 + 4x_3 + 2x_4 = 300$ 
            $8x_1 + x_2 + x_3 + 5x_4 = 300$ 
and         $x_j \ge 0$, for j = 1, 2, 3, 4. 

(a) Using the Big M method, construct the complete first simplex tableau for the simplex 
method and identify the corresponding initial (artificial) basic feasible solution. Also 
identify the initial entering basic variable and the leaving basic variable. 

(b) Solve by the simplex method. 

(c) Using the two-phase method, construct the complete first simplex tableau for Phase 
1 and identify the corresponding initial (artificial) basic feasible solution. Also 
identify the initial entering basic variable and the leaving basic variable. 

(d) Perform Phase 1. 

(e) Construct the complete first simplex tableau for Phase 2. 

(f) Perform Phase 2. 

(g) Compare the sequence of the basic feasible solutions obtained in part (b) with that 
in parts (d) and (f). Which of these solutions are artificial basic feasible solutions 
for just a revised problem and which are actually feasible for the real problem? 


26.* Consider the following problem. 
Minimize    $Z = 2x_1 + 3x_2 + x_3$, 
subject to  $x_1 + 4x_2 + 2x_3 \ge 8$ 
            $3x_1 + 2x_2 \ge 6$ 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 

(a) Reformulate this problem to fit our standard form for a linear programming model 
presented in Sec. 3.2. 

(b) Using the Big M method, solve by the simplex method. 

(c) Using the two-phase method, solve by the simplex method. 

(d) Compare the sequence of basic feasible solutions obtained in parts (b) and (c). Which 
of these solutions are artificial basic feasible solutions for just a revised problem 
and which are actually feasible for the real problem? 


27. Consider the following problem. 
Maximize    $Z = -2x_1 + x_2 - 4x_3 + 3x_4$, 
subject to  $x_1 + x_2 + 3x_3 + 2x_4 \le 4$ 
            $x_1 - x_3 + x_4 \ge -1$ 
            $2x_1 + x_2 \le 2$ 
            $x_1 + 2x_2 + x_3 + 2x_4 = 2$ 
and         $x_2 \ge 0$, $x_3 \ge 0$, $x_4 \ge 0$ 
(no nonnegativity constraint for $x_1$). 

(a) Reformulate this problem (except for the equality constraint) to fit our standard form 
for a linear programming model presented in Sec. 3.2. 

(b) Using the Big M method, construct the complete first simplex tableau for the simplex 
method and identify the corresponding initial (artificial) basic feasible solution. Also 
identify the initial entering basic variable and the leaving basic variable. 

(c) Using the two-phase method, construct row 0 of the first simplex tableau for Phase 1. 

(d) Use the automatic routine for the simplex method in your OR COURSEWARE to 
solve this problem. 


28. Consider the following problem. 
Maximize    $Z = 4x_1 + 5x_2 + 3x_3$, 
subject to  $x_1 + x_2 + 2x_3 \ge 20$ 
            $15x_1 + 6x_2 - 5x_3 \le 50$ 
            $x_1 + 3x_2 + 5x_3 \le 30$ 
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 
Use the simplex method to demonstrate that this problem does not possess any feasible solutions. 
29. Consider Prob. 20 of Chap. 8. The linear programming model for this problem has 
more than 5,000 functional constraints and more than 150,000 variables. 
(a) There are more than 750,000,000 coefficients for these constraints, which creates a 
storage problem for a computer solution of the model. Considering that more than 
99 percent of these coefficients are zeroes, recommend a way to alleviate this problem. 

(b) Since the number of nonzero coefficients is well over 100,000, manually inputting 
these data into the computer would be excessively time-consuming. Considering that 
the number of items of basic raw data is much smaller, recommend a way to alleviate 
this problem. 


The Theory of the 
Simplex Method 


Chapter 4 introduced the basic mechanics of the simplex method. Now we shall delve 
a little deeper into this algorithm by examining some of its underlying theory. The 
first section develops the general geometric and algebraic properties that form the 
foundation of the simplex method. We then describe the matrix form of the simplex 
method (called the revised simplex method), which streamlines the procedure consid- 
erably for computer implementation. Next, we present a fundamental insight about a 
property of the simplex method that enables us to deduce how changes that are made 
in the original model get carried along to the final simplex tableau. This insight will 
provide the key to the important topics of Chap. 6 (duality theory and sensitivity 
analysis). 


5.1 Foundations of the Simplex Method 


Section 4.1 introduced corner-point feasible solutions and the key role they play in 
the simplex method. These geometric concepts were related to the algebra of the 
simplex method in Sec. 4.2. However, all of this was done in the context of the 
Wyndor Glass Co. problem, which has only two variables and so has a straightforward 
geometric interpretation. How do these concepts generalize to higher dimensions when 
we deal with larger problems? We address this question in this section. 

Figure 5.1 Constraint boundaries, constraint boundary equations, and corner-point solutions for the 
Wyndor Glass Co. problem. 

We begin by introducing some basic terminology for any linear programming 
problem with n variables (before introducing slack and artificial variables for initial- 
izing the simplex method). While we are doing this, you should find it helpful to refer 
to Fig. 5.1 to interpret these definitions in two dimensions (n = 2). 


Terminology 


It may seem intuitive that optimal solutions for any linear programming problem must 
lie on the boundary of the feasible region, and this is in fact a general property. 
Because boundary is a geometric concept, our initial definitions clarify how the bound- 
ary of the feasible region is identified algebraically. 


The constraint boundary equation for any constraint is obtained by replacing its $\le$, 
$=$, or $\ge$ sign by an $=$ sign. 


Consequently, the form of a constraint boundary equation is $a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = b_i$ 
for functional constraints and $x_j = 0$ for nonnegativity constraints. Each 
such equation defines a “flat” geometric shape (called a hyperplane) in n-dimensional 
space, analogous to the line in two-dimensional space and the plane in three-dimen- 
sional space. This hyperplane forms the constraint boundary for the corresponding 
constraint, because any points lying on one side of the constraint boundary violate 
the constraint whereas points on the constraint boundary satisfy the constraint. (Points 
on the other side also satisfy the constraint if it is an inequality constraint.) 
For example, the Wyndor Glass Co. problem has five constraints (three func- 
tional constraints and two nonnegativity constraints), so it has the five constraint 
boundary equations shown in Fig. 5.1. Because n = 2, the hyperplanes defined by 
these constraint boundary equations are simply lines. Therefore, the constraint 
boundaries for the five constraints are the five lines shown in Fig. 5.1. 


The boundary of the feasible region contains just those feasible solutions that satisfy 
one or more of the constraint boundary equations. 


Geometrically, any point on the boundary of the feasible region lies on one or 
more of the hyperplanes defined by the respective constraint boundary equations. 
Thus, in Fig. 5.1, the boundary consists of the five darker line segments. 
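
The two definitions above translate directly into a membership check. The following sketch is only an illustration: the function name `classify` and the data layout are hypothetical, and the directions of the three functional constraints (all ≤) are taken from the formulation of the Wyndor Glass Co. problem in Chap. 3. A point is reported as on the boundary when it is feasible and satisfies at least one constraint boundary equation.

```python
# Wyndor Glass Co. constraints as (coefficients, sense, right-hand side).
CONSTRAINTS = [
    ((1, 0), "<=", 4),    # x1 <= 4
    ((0, 2), "<=", 12),   # 2x2 <= 12
    ((3, 2), "<=", 18),   # 3x1 + 2x2 <= 18
    ((1, 0), ">=", 0),    # x1 >= 0
    ((0, 1), ">=", 0),    # x2 >= 0
]

def classify(point):
    """Return 'infeasible', 'boundary', or 'interior' for a point (x1, x2)."""
    on_boundary = False
    for coeffs, sense, rhs in CONSTRAINTS:
        lhs = sum(a * x for a, x in zip(coeffs, point))
        violated = lhs > rhs if sense == "<=" else lhs < rhs
        if violated:
            return "infeasible"
        if lhs == rhs:
            on_boundary = True  # the point satisfies this boundary equation
    return "boundary" if on_boundary else "interior"

for p in [(2, 3), (0, 3), (2, 6), (4, 6)]:
    print(p, classify(p))
```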

Next, we give a general definition of corner-point feasible solution in n-dimensional 
space. 


A corner-point feasible solution is a feasible solution that does not lie on any line 
segment¹ connecting two other feasible solutions. 


As this definition implies, a feasible solution that does lie on a line segment connecting 
two other feasible solutions is not a corner-point feasible solution. To illustrate when 
n = 2, consider Fig. 5.1. The point (2, 3) is not a corner-point feasible solution, 
because it lies on various such line segments, e.g., the line segment connecting 
(0, 3) and (4, 3). Similarly, (0, 3) is not a corner-point feasible solution, because it 
lies on the line segment connecting (0, 0) and (0, 6). However, (0, 0) is a corner- 
point feasible solution, because it is impossible to find two other feasible solutions 
that lie on completely opposite sides of (0, 0). (Try it.) 
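
As a worked check of these cases (using only the points named above), each of the two non-corner points is a weighted average of two other feasible solutions with weights strictly between 0 and 1:

```latex
\[
(2, 3) = \tfrac{1}{2}(0, 3) + \tfrac{1}{2}(4, 3), \qquad
(0, 3) = \tfrac{1}{2}(0, 0) + \tfrac{1}{2}(0, 6).
\]
```

No such representation is possible for (0, 0). Every feasible solution has nonnegative coordinates, so a weighted average of two feasible solutions with positive weights can equal (0, 0) only if both solutions are (0, 0) themselves.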

When the number of decision variables n is greater than 2 or 3, this definition 
for corner-point feasible solution is not a very convenient one for identifying such 
solutions. Therefore, it will prove most helpful to interpret these solutions algebraically. 
For the Wyndor Glass Co. example, each corner-point feasible solution in Fig. 
5.1 lies at the intersection of two (n = 2) constraint lines; i.e., it is the simultaneous 
solution of a system of two constraint boundary equations. This situation is summarized 
in Table 5.1, where defining equations refer to the constraint boundary equations 
that yield (define) the indicated corner-point feasible solution. Similarly, for any 
linear programming problem, each corner-point feasible solution lies at the intersection 
of n constraint boundaries; i.e., it is the simultaneous solution of a system of n 
constraint boundary equations. 

However, this is not to say that every set of n constraint boundary equations 
chosen from among the (n + m) constraints (n nonnegativity and m functional constraints) 
yields a corner-point feasible solution. In particular, the simultaneous solution 
of such a system of equations might violate one or more of the other m constraints 
not chosen, in which case it is a corner-point infeasible solution. The example has 
three such solutions, as summarized in Table 5.2. (Check to see why they are infeasible.) 

Furthermore, a system of n constraint boundary equations might have no solution 
at all. This occurs twice in the example, with the pairs of equations (1) $x_1 = 0$ and 
$x_1 = 4$ and (2) $x_2 = 0$ and $2x_2 = 12$. Such systems are of no interest to us. 


1 A formal definition of line segment is given in Appendix 1. 


Table 5.1 Defining Equations for Each Corner-Point Feasible Solution 
for Wyndor Glass Co. Problem 

Corner-Point 
Feasible Solution    Defining Equations 

(0, 0)               $x_1 = 0$ 
                     $x_2 = 0$ 

(0, 6)               $x_1 = 0$ 
                     $2x_2 = 12$ 

(2, 6)               $2x_2 = 12$ 
                     $3x_1 + 2x_2 = 18$ 

(4, 3)               $3x_1 + 2x_2 = 18$ 
                     $x_1 = 4$ 

(4, 0)               $x_1 = 4$ 
                     $x_2 = 0$ 


The final possibility (which never occurs in the example) is that a system of n 
constraint boundary equations has multiple solutions because of redundant equations. 
You need not be concerned with this case either, because the simplex method circum- 
vents its difficulties. 

To summarize for the example, with five constraints and two variables, there 
are 10 pairs of constraint boundary equations. Five of these pairs became defining 
equations for corner-point feasible solutions (Table 5.1), three became defining 
equations for corner-point infeasible solutions (Table 5.2), and each of the final two pairs 
had no solution. 
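
This count can be reproduced by brute force. The sketch below is only illustrative: the names `BOUNDARIES`, `feasible`, and `classify_pairs` are hypothetical, and the ≤ direction of the three functional constraints is taken from the formulation in Chap. 3. It solves every pair of constraint boundary equations by Cramer's rule and sorts the results into the three groups.

```python
from fractions import Fraction
from itertools import combinations

# (label, a1, a2, b, is_upper): boundary equation a1*x1 + a2*x2 = b, where
# is_upper is True for a "<=" constraint and False for a ">=" constraint.
BOUNDARIES = [
    ("x1 = 0", 1, 0, 0, False),
    ("x2 = 0", 0, 1, 0, False),
    ("x1 = 4", 1, 0, 4, True),
    ("2x2 = 12", 0, 2, 12, True),
    ("3x1 + 2x2 = 18", 3, 2, 18, True),
]

def feasible(x1, x2):
    return all((a1 * x1 + a2 * x2 <= b) if upper else (a1 * x1 + a2 * x2 >= b)
               for _, a1, a2, b, upper in BOUNDARIES)

def classify_pairs():
    result = {"feasible": [], "infeasible": [], "no solution": []}
    for (n1, a, b, e, _), (n2, c, d, f, _) in combinations(BOUNDARIES, 2):
        det = a * d - b * c
        if det == 0:
            # Parallel boundaries; in this example they never coincide.
            result["no solution"].append((n1, n2))
            continue
        # Cramer's rule for a*x1 + b*x2 = e and c*x1 + d*x2 = f.
        x1 = Fraction(e * d - b * f, det)
        x2 = Fraction(a * f - e * c, det)
        kind = "feasible" if feasible(x1, x2) else "infeasible"
        result[kind].append(((x1, x2), n1, n2))
    return result

for kind, items in classify_pairs().items():
    print(kind, len(items))
```

Enumerating all pairs is workable only for tiny problems; the number of systems grows combinatorially with n and m, which is one reason the simplex method examines only a path of adjacent solutions instead.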


Adjacent Corner-Point Feasible Solutions 


We now will focus on adjacent corner-point feasible solutions and their role in solving 
linear programming problems. Recall from Chap. 4 that, when ignoring slack and 
artificial variables, each iteration of the simplex method moves from the current cor- 
ner-point feasible solution to an adjacent one. What is the path followed in this 
process? What really is meant by adjacent corner-point feasible solution? We first 
address these questions from a geometric viewpoint, and then turn to algebraic inter- 
pretations. 


Table 5.2 Defining Equations for Each Corner-Point Infeasible Solution 
for Wyndor Glass Co. Problem 

Corner-Point 
Infeasible Solution  Defining Equations 

(0, 9)               $x_1 = 0$ 
                     $3x_1 + 2x_2 = 18$ 

(4, 6)               $2x_2 = 12$ 
                     $x_1 = 4$ 

(6, 0)               $3x_1 + 2x_2 = 18$ 
                     $x_2 = 0$ 
Figure 5.2 Feasible region and corner-point feasible solutions for a three-variable 
linear programming problem. 


These questions are easy to answer when n = 2. In this case, the boundary of 
the feasible region consists of several connected line segments forming a polygon, as 
shown in Fig. 5.1 by the five darker line segments. These line segments are referred 
to as edges of the feasible region. Emanating out of each corner-point feasible solution 
are two such edges leading to an adjacent corner-point feasible solution at the other 
end. (Note in Fig. 5.1 how each corner-point feasible solution has two adjacent ones.) 
The path followed in an iteration is to move along one of these edges from one end 
to the other. In Fig. 5.1, the first iteration involves moving along the edge from 
(0, 0) to (0, 6), and then the next iteration moves along the edge from (0, 6) to 
(2, 6). As Table 5.1 illustrates, each of these moves to an adjacent corner-point feasi- 
ble solution involves just one change in the set of defining equations (constraint bound- 
aries on which the solution lies). 

When n = 3, the answers are slightly more complicated. To help you visualize 
what is going on, Fig. 5.2 shows a three-dimensional drawing of a typical feasible 
region when n = 3, where the dots are the corner-point feasible solutions. This 
feasible region is a polyhedron rather than the polygon we had with n = 2 (Fig. 5.1), 
because the constraint boundaries now are planes rather than lines. The faces of the 
polyhedron form the boundary of the feasible region, where each face is the portion 
of a constraint boundary that satisfies the other constraints as well. Note that each 
corner-point feasible solution lies at the intersection of three constraint boundaries 
(perhaps including some of the $x_1 = 0$, $x_2 = 0$, and $x_3 = 0$ constraint boundaries 
for the nonnegativity constraints), and the solution also satisfies the other constraints. 
Such intersections that don’t satisfy one or more of the other constraints yield 
corner-point infeasible solutions instead. 

The darker line segment in Fig. 5.2 depicts the path of the simplex method on 
a typical iteration. The point (2, 4, 3) is the current corner-point feasible solution to 
begin the iteration, and the point (4, 2, 4) will be the new corner-point feasible solution 
at the end of the iteration. The point (2, 4, 3) lies at the intersection of the $x_2 = 4$, 
$x_1 + x_2 = 6$, and $-x_1 + 2x_3 = 4$ constraint boundaries, so these three equations 
are the defining equations for this corner-point feasible solution. If the $x_2 = 4$ defining 
equation were removed, the intersection of the other two constraint boundaries (planes) 
would form a line. One segment of this line, shown as the dark line segment from 
(2, 4, 3) to (4, 2, 4) in Fig. 5.2, lies on the boundary of the feasible region, whereas 
the rest of the line is infeasible. This line segment is called an edge of the feasible 
region, and its endpoints, (2, 4, 3) and (4, 2, 4), are adjacent corner-point feasible 
solutions. 

(The constraints listed in Fig. 5.2 are $x_1 \le 4$, $x_2 \le 4$, $x_1 + x_2 \le 6$, $-x_1 + 2x_3 \le 4$, 
and $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$.) 

For n = 3, all of the edges of the feasible region are formed in this way as the 
feasible segment of the line lying at the intersection of two constraint boundaries, and 
the two endpoints of an edge are adjacent corner-point feasible solutions. In Fig. 5.2 
there are 15 edges of the feasible region, and so there are 15 pairs of adjacent corner- 
point feasible solutions. For the current corner-point feasible solution (2, 4, 3) there 
are three ways to remove one of its three defining equations to obtain an intersection 
of the other two constraint boundaries, so there are three edges emanating out of 
(2, 4, 3). These edges lead to (4, 2, 4), (0, 4, 2), and (2, 4, 0), so these are the 
corner-point feasible solutions that are adjacent to (2, 4, 3). 
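
The edge count can be checked computationally. The sketch below is a hedged illustration, not part of the original development: the constraint list is an assumption read off from the defining equations quoted in the text ($x_1 \le 4$, $x_2 \le 4$, $x_1 + x_2 \le 6$, $-x_1 + 2x_3 \le 4$, plus nonnegativity), and all function names are hypothetical. It finds the corner-point feasible solutions by solving every triple of boundary equations, then treats two of them as adjacent when they share n − 1 = 2 defining equations, which is valid here because no solution lies on more than three boundaries.

```python
from fractions import Fraction
from itertools import combinations

# Each row (a1, a2, a3, b) means a1*x1 + a2*x2 + a3*x3 <= b (assumed model).
ROWS = [
    (1, 0, 0, 4),    # x1 <= 4
    (0, 1, 0, 4),    # x2 <= 4
    (1, 1, 0, 6),    # x1 + x2 <= 6
    (-1, 0, 2, 4),   # -x1 + 2x3 <= 4
    (-1, 0, 0, 0),   # x1 >= 0
    (0, -1, 0, 0),   # x2 >= 0
    (0, 0, -1, 0),   # x3 >= 0
]

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(rows):
    """Solve three boundary equations by Cramer's rule; None if singular."""
    A = [r[:3] for r in rows]
    rhs = [r[3] for r in rows]
    d = det3(A)
    if d == 0:
        return None
    return tuple(
        Fraction(det3([tuple(rhs[i] if j == k else A[i][j] for j in range(3))
                       for i in range(3)]), d)
        for k in range(3))

def is_feasible(x):
    return all(sum(a * v for a, v in zip(r[:3], x)) <= r[3] for r in ROWS)

def cpf_solutions():
    """Map each corner-point feasible solution to its defining equations."""
    found = {}
    for idx in combinations(range(len(ROWS)), 3):
        x = solve3([ROWS[i] for i in idx])
        if x is not None and is_feasible(x):
            found.setdefault(x, set()).update(idx)
    return found

def adjacent_pairs(found):
    # Adjacent solutions share two defining equations, i.e., an edge.
    return [(p, q) for p, q in combinations(sorted(found), 2)
            if len(found[p] & found[q]) >= 2]

found = cpf_solutions()
edges = adjacent_pairs(found)
print(len(found), "corner-point feasible solutions,", len(edges), "edges")
```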

For the next iteration, the simplex method chooses one of these three edges, 
say, the darker line segment in Fig. 5.2, and then moves along this edge away from 
(2, 4, 3) until it reaches the first new constraint boundary, $x_1 = 4$, at its other endpoint. 
[We cannot continue further along this line to the next constraint boundary, $x_2 = 0$, 
because this leads to a corner-point infeasible solution, (6, 0, 5).] The intersection of 
this first new constraint boundary with the two constraint boundaries forming the edge 
yields the new corner-point feasible solution (4, 2, 4). 
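
The stopping rule in this move can be written as a small ratio computation. The following sketch is illustrative only: the constraint list is the same assumed model for Fig. 5.2 (read off from the equations quoted in the text), and the name `boundaries_hit` is hypothetical. Starting from (2, 4, 3) and moving in the direction of the edge, it computes how far one can travel before each remaining constraint boundary is reached; the smallest step identifies the new corner-point feasible solution.

```python
from fractions import Fraction

# Boundary label -> (coefficients a, right-hand side b) for a . x <= b.
ROWS = {
    "x1 = 4": ((1, 0, 0), 4),
    "x2 = 4": ((0, 1, 0), 4),
    "x1 + x2 = 6": ((1, 1, 0), 6),
    "-x1 + 2x3 = 4": ((-1, 0, 2), 4),
    "x1 = 0": ((-1, 0, 0), 0),
    "x2 = 0": ((0, -1, 0), 0),
    "x3 = 0": ((0, 0, -1), 0),
}

def boundaries_hit(start, direction):
    """List (step, boundary, point) for boundaries approached along direction."""
    hits = []
    for name, (a, b) in ROWS.items():
        rate = sum(ai * di for ai, di in zip(a, direction))
        if rate > 0:  # the slack in this constraint shrinks as we move
            slack = b - sum(ai * xi for ai, xi in zip(a, start))
            t = Fraction(slack, rate)
            point = tuple(x + t * d for x, d in zip(start, direction))
            hits.append((t, name, point))
    return sorted(hits)

# Direction of the edge from (2, 4, 3) toward (4, 2, 4).
for step, name, point in boundaries_hit((2, 4, 3), (2, -2, 1)):
    print(step, name, point)
```

The first boundary listed is the one where the move must stop; any later one lies beyond the feasible region, like the point (6, 0, 5) mentioned above. This is the geometric counterpart of the minimum ratio test used to choose the leaving basic variable.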

When n > 3, these same concepts generalize to higher dimensions, except the 
constraint boundaries now are hyperplanes instead of planes. Let us summarize. 


A corner-point feasible solution lies at the intersection of n constraint boundaries (and 
satisfies the other constraints as well). An edge of the feasible region is a feasible 
line segment that lies at the intersection of (n − 1) constraint boundaries, where each 
endpoint lies on one additional constraint boundary (so that these endpoints are 
corner-point feasible solutions). Two corner-point feasible solutions are adjacent if the line 
segment connecting them is an edge of the feasible region. Emanating out of each 
corner-point feasible solution are n such edges, each one leading to one of the n 
adjacent corner-point feasible solutions. Each iteration of the simplex method moves 
from the current corner-point feasible solution to an adjacent one by moving along one 
of these n edges. 


When you shift from a geometric viewpoint to an algebraic one, intersection of 
constraint boundaries changes to simultaneous solution of constraint boundary equations. 
The n constraint boundary equations yielding (defining) a corner-point feasible 
solution are its defining equations, where deleting one of these equations yields a line 
whose feasible segment is an edge of the feasible region. 

We next analyze some key properties of corner-point feasible solutions, and 
then describe the implications of all of these concepts for interpreting the simplex 
method. However, while the above summary is fresh in your mind, let us give you a 
preview of its implications. When the simplex method chooses an entering basic 
variable, the geometric interpretation is that it is choosing one of the edges emanating 
out of the current corner-point feasible solution to move along. Increasing this variable 
from zero (and simultaneously changing the values of the other basic variables ac- 
cordingly) corresponds to moving along this edge. Having the first basic variable 
reach zero (the leaving basic variable) corresponds to reaching the first new constraint 
boundary at the other end of this edge of the feasible region. 
Properties of Corner-Point Feasible Solutions 


In Sec. 4.1 we listed three key properties of corner-point feasible solutions that
constitute the underlying principles of the simplex method. We now are in a position to
explain why these properties (restated here) do indeed hold for any linear programming
problem that has feasible solutions and a bounded feasible region.


Property 1: (a) If there is exactly one optimal solution, then it must be a
corner-point feasible solution. (b) If there are multiple optimal solutions, then
at least two must be adjacent corner-point feasible solutions.


Property 1 is a rather intuitive one from a geometric viewpoint. First consider 
case (a), which is illustrated by the Wyndor Glass Co. problem (see Fig. 3.3 or 5.1) 
where the one optimal solution (2, 6) is indeed a corner-point feasible solution. Note 
that there is nothing special about this example that led to this result. For any problem 
having just one optimal solution, it always is possible to keep raising the objective 
function line (hyperplane) until it just touches one point (the optimal solution) at a 
corner of the feasible region. 

The following algebraic viewpoint also clarifies why the property must hold in
case (a). We will construct a proof by contradiction by assuming that the one optimal
solution is not a corner-point feasible solution and then showing that this assumption
leads to a contradiction and so cannot be true. The key step is to notice from the
definition of corner-point feasible solution that this assumption implies that there must
be two other feasible solutions such that the line segment connecting them contains
the optimal solution. Let the vectors $\mathbf{x}^*$, $\mathbf{x}'$, $\mathbf{x}''$ denote the optimal solution and these
two other feasible solutions, respectively, and let $Z^*$, $Z_1$, $Z_2$ denote their respective
objective function values. Like each other point on the line segment connecting $\mathbf{x}'$
and $\mathbf{x}''$ (see Appendix 1),

$$\mathbf{x}^* = \alpha \mathbf{x}'' + (1 - \alpha)\mathbf{x}'$$

for some value of $\alpha$ such that $0 < \alpha < 1$. Because the objective function is linear, it follows that

$$Z^* = \alpha Z_2 + (1 - \alpha) Z_1.$$

Since the weights, $\alpha$ and $(1 - \alpha)$, add up to 1, the only possibilities for how $Z^*$, $Z_1$,
and $Z_2$ compare are (1) $Z^* = Z_1 = Z_2$, (2) $Z_1 < Z^* < Z_2$, and (3) $Z_1 > Z^* > Z_2$.
The first possibility implies that $\mathbf{x}'$ and $\mathbf{x}''$ also are optimal, which contradicts the
assumption that case (a) holds. Both the latter possibilities contradict the assumption
that $\mathbf{x}^*$ is optimal, because one of the two other feasible solutions would then have a
larger value of Z. The resulting conclusion is that it is impossible to have a single
optimal solution that is not a corner-point feasible solution.

Now consider case (b), which was demonstrated in Sec. 3.2 under the definition
of optimal solution by changing the objective function in the example to $Z = 3x_1 + 2x_2$.
What then happens in the graphical solution procedure is that the objective
function line keeps getting raised until it contains the line segment connecting the two
corner-point feasible solutions (2, 6) and (4, 3). The same thing would happen in
higher dimensions except that now it would be an objective function hyperplane that
keeps getting raised until it contains the line segment(s) connecting two (or more)
adjacent corner-point feasible solutions. As a consequence, all optimal solutions can
be obtained as weighted averages of optimal corner-point feasible solutions. (This
situation is described further in Probs. 13 and 14 at the end of Chap. 4.)


The real significance of Property 1 is that it greatly simplifies the search for an 
optimal solution because now only corner-point feasible solutions need be considered. 
The magnitude of this simplification is emphasized in Property 2. 


Property 2: There are only a finite number of corner-point feasible solutions. 


This property certainly holds in Figs. 5.1 and 5.2, where there are just five and
ten corner-point feasible solutions, respectively. To see why the number is finite in
general, recall that each corner-point feasible solution is the simultaneous solution of
a system of n out of the (m + n) constraint boundary equations. The number of
different combinations of (m + n) equations taken n at a time is

$$\binom{m+n}{n} = \frac{(m + n)!}{m!\,n!},$$


which is a finite number. This number, in turn, is an upper bound on the number of 
corner-point feasible solutions. In Fig. 5.1, m = 3 and n = 2, so there are 10 different 
systems of two equations, but only half of them yield corner-point feasible solutions. 
In Fig. 5.2, m = 4 and n = 3, which gives 35 different systems of three equations, 
but only 10 yield corner-point feasible solutions. 
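
As a quick check of this counting formula, the following Python sketch evaluates (m + n)!/(m! n!) with the standard-library function math.comb. The helper name count_systems is introduced here only for illustration; it is not part of the text.

```python
import math

def count_systems(m: int, n: int) -> int:
    """Number of ways to choose n of the (m + n) constraint boundary equations."""
    return math.comb(m + n, n)  # equals (m + n)! / (m! n!)

fig_5_1 = count_systems(3, 2)   # Wyndor Glass Co. problem (Fig. 5.1)
fig_5_2 = count_systems(4, 3)   # three-variable example (Fig. 5.2)
large = count_systems(50, 50)   # the bound grows very quickly with problem size

print(fig_5_1, fig_5_2)         # 10 35
print(f"{large:.2e}")           # order of magnitude of the m = n = 50 case
```

The count is only an upper bound: some systems have no solution, and some solutions are infeasible.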

Property 2 suggests that an optimal solution can be obtained just by exhaustive
enumeration; i.e., find and compare all the finite number of corner-point feasible
solutions. Unfortunately, there are finite numbers, and then there are finite numbers
that (for all practical purposes) might as well be infinite. For example, a rather small
linear programming problem with only m = 50 and n = 50 would have $(100!)/(50!)^2 \approx 10^{29}$
systems of equations to be solved! By contrast, the simplex method would
need to examine only approximately 100 corner-point feasible solutions for a problem
of this size. This tremendous savings can be obtained because of the optimality test
provided by Property 3:


Property 3: If a corner-point feasible solution has no adjacent corner-point 
feasible solutions that are better (as measured by Z), then there are no better 
corner-point feasible solutions anywhere; i.e., it is optimal. 


To illustrate Property 3, consider Fig. 5.1 (or Fig. 3.3) for the Wyndor Glass
Co. example. For the corner-point feasible solution (2, 6), its adjacent corner-point
feasible solutions are (0, 6) and (4, 3), and neither has a better value of Z than for
(2, 6). This outcome implies that none of the other corner-point feasible solutions,
(0, 0) and (4, 0), can be better than (2, 6), so (2, 6) must be optimal.

By contrast, Fig. 5.3 shows a feasible region that can never occur for a linear
programming problem but that does violate Property 3. The problem shown is identical
to the Wyndor Glass Co. example (including the same objective function) except for
the enlargement of the feasible region to the right of (8/3, 5). Consequently, the adjacent
corner-point feasible solutions for (2, 6) now are (0, 6) and (8/3, 5), and again neither
is better than (2, 6). However, another corner-point feasible solution (4, 5) now is
better than (2, 6), thereby violating Property 3. The reason is that the boundary of
the feasible region goes down from (2, 6) to (8/3, 5), and then ‘‘bends outward’’ to
(4, 5), beyond the objective function line passing through (2, 6).

The key point is that the kind of situation illustrated in Fig. 5.3 can never occur
in linear programming. The feasible region in Fig. 5.3 implies that the $2x_2 \le 12$ and
$3x_1 + 2x_2 \le 18$ constraints apply for $0 \le x_1 \le 8/3$. However, under the condition that
$8/3 \le x_1 \le 4$, the $3x_1 + 2x_2 \le 18$ constraint is dropped and replaced by $x_2 \le 5$. Such
‘‘conditional constraints’’ just are not allowed in linear programming.

Figure 5.3 Modification of the Wyndor Glass Co. problem that violates both linear
programming and Property 3 for corner-point feasible solutions in linear programming.
[Graph with axes $x_1$ and $x_2$ showing the objective function line $Z = 36 = 3x_1 + 5x_2$.]

The basic reason that Property 3 holds for any linear programming problem is
that the feasible region always has the property of being a convex set, as defined in
Appendix 1 and illustrated in several figures there. For two-variable linear programming
problems, this convex property means that the angle inside the feasible region
at every corner-point feasible solution is less than 180°. This property is illustrated in
Fig. 5.1, where the angles at (0, 0), (0, 6), and (4, 0) are 90° and those at (2, 6) and
(4, 3) are between 90° and 180°. By contrast, the feasible region in Fig. 5.3 is not a
convex set, because the angle at (8/3, 5) is more than 180°. This is the kind of ‘‘bending
outward’’ at an angle greater than 180° that can never occur in linear programming.
In higher dimensions, the same intuitive notion of ‘‘never bending outward’’ continues
to apply.

To clarify the significance of a convex feasible region, consider the objective
function hyperplane that passes through a corner-point feasible solution that has no
adjacent corner-point feasible solutions that are better. [In the original Wyndor Glass
Co. example, this hyperplane is the line passing through (2, 6) in Fig. 3.3.] All of
these adjacent solutions [(0, 6) and (4, 3) in the example] must either lie on the
hyperplane or lie on the unfavorable side (as measured by Z) of the hyperplane. The
feasible region being convex means that its boundary cannot ‘‘bend outward’’ beyond
an adjacent corner-point feasible solution to give another corner-point feasible solution
that lies on the favorable side of the hyperplane, so Property 3 holds.
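
The contrast between the convex Wyndor region and the nonconvex region of Fig. 5.3 can be checked numerically. The sketch below is only an illustration under stated assumptions: the vertex lists are typed in by hand in boundary order (so neighbors in the list are adjacent corner points), the point (8/3, 5) is taken from the discussion of Fig. 5.3, and the function name local_not_global is hypothetical.

```python
from fractions import Fraction as F

def z(point):
    """Objective function of the Wyndor Glass Co. example, Z = 3*x1 + 5*x2."""
    x1, x2 = point
    return 3 * x1 + 5 * x2

def local_not_global(vertices):
    """Corner points with no better adjacent corner point that are still not optimal.

    Assumes the vertices are listed in order around the boundary, so the
    neighbors of vertices[i] are the entries just before and just after it.
    """
    best = max(z(p) for p in vertices)
    violators = []
    k = len(vertices)
    for i, p in enumerate(vertices):
        neighbors = (vertices[i - 1], vertices[(i + 1) % k])
        if all(z(q) <= z(p) for q in neighbors) and z(p) < best:
            violators.append(p)
    return violators

wyndor = [(0, 0), (0, 6), (2, 6), (4, 3), (4, 0)]                 # convex region
fig_5_3 = [(0, 0), (0, 6), (2, 6), (F(8, 3), 5), (4, 5), (4, 0)]  # nonconvex region

print(local_not_global(wyndor))    # no violations of Property 3
print(local_not_global(fig_5_3))   # (2, 6) passes the local test but (4, 5) is better
```

For the convex region the list of violators is empty, which is what Property 3 asserts; for the modified region, (2, 6) has Z = 36 while (4, 5) has Z = 37.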


Extensions to the Augmented Form of the Problem 


For any linear programming problem in our standard form (see Sec. 3.2), the
appearance of the functional constraints after slack variables are introduced (see Sec.
4.2) is as follows:

$$
\begin{aligned}
(1)\quad & a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n + x_{n+1} = b_1 \\
(2)\quad & a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n + x_{n+2} = b_2 \\
& \qquad \vdots \\
(m)\quad & a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n + x_{n+m} = b_m
\end{aligned}
$$

where $x_{n+1}, x_{n+2}, \ldots, x_{n+m}$ are the slack variables. For other linear programming
problems, Sec. 4.6 described how this same appearance (proper form from Gaussian
elimination) can be obtained by introducing artificial variables, etc. Thus the original
solutions $(x_1, x_2, \ldots, x_n)$ now are augmented by the corresponding values of the
slack or artificial variables $(x_{n+1}, x_{n+2}, \ldots, x_{n+m})$. This augmentation led in Sec.
4.2 to defining basic solutions as augmented corner-point solutions, and basic feasible
solutions as augmented corner-point feasible solutions. Consequently, the preceding
three properties of corner-point feasible solutions also hold for basic feasible
solutions.

Now let us clarify the algebraic relationships between basic solutions and corner- 
point solutions. Recall that each corner-point solution is the simultaneous solution of 
a system of n constraint boundary equations, which we called its defining equations. 
The key question is: ‘‘How do we tell whether a particular constraint boundary equa- 
tion is one of the defining equations when the problem is in augmented form?’’ The 
answer, fortunately, is a simple one. Since there now are (n + m) variables, one for 
each of the (n + m) nonnegativity and functional constraints, each constraint has 
exactly one indicating variable that completely indicates (by whether its value is 
zero) whether that constraint’s boundary equation is satisfied by the current solution. 
A summary appears in Table 5.3. 

Thus whenever a constraint boundary equation is one of the defining equations 
for a corner-point solution, its indicating variable has a value of zero in the augmented 
form of the problem. Each such indicating variable is called a nonbasic variable for 
the corresponding basic solution. The resulting conclusions and terminology (already 
introduced in Sec. 4.2) are summarized next. 


Each basic solution has n nonbasic variables set equal to zero. The values of the
remaining m variables (called basic variables) are the simultaneous solution of
the system of m equations for the problem in augmented form (after setting the
nonbasic variables to zero). This basic solution is the augmented corner-point solution
whose n defining equations are those indicated by the nonbasic variables.


Now consider the basic feasible solutions. Note that the only requirements for 
a solution to be feasible in the augmented form of the problem are that it satisfy the 
system of equations and that all the variables be nonnegative. 


Table 5.3 Indicating Variables for Constraint Boundary Equations*

| Original Constraint (in Augmented Form) | Constraint Boundary Equation | Indicating Variable |
|---|---|---|
| $x_j \ge 0$ $(j = 1, 2, \ldots, n)$ | $x_j = 0$ | $x_j$ |
| $\sum_{j=1}^{n} a_{ij}x_j + x_{n+i} = b_i$ $(i = 1, 2, \ldots, m)$ | $\sum_{j=1}^{n} a_{ij}x_j = b_i$ | $x_{n+i}$ |

* Indicating variable = 0 ⇒ constraint boundary equation satisfied;
indicating variable ≠ 0 ⇒ constraint boundary equation violated.
Table 5.4 Indicating Variables for Constraint Boundary Equations of Wyndor Glass Co. Problem*

| Original Constraint (in Augmented Form) | Constraint Boundary Equation | Indicating Variable |
|---|---|---|
| $x_1 \ge 0$ | $x_1 = 0$ | $x_1$ |
| $x_2 \ge 0$ | $x_2 = 0$ | $x_2$ |
| (1) $x_1 + x_3 = 4$ | $x_1 = 4$ | $x_3$ |
| (2) $2x_2 + x_4 = 12$ | $2x_2 = 12$ | $x_4$ |
| (3) $3x_1 + 2x_2 + x_5 = 18$ | $3x_1 + 2x_2 = 18$ | $x_5$ |

* Indicating variable = 0 ⇒ constraint boundary equation satisfied;
indicating variable ≠ 0 ⇒ constraint boundary equation violated.


A basic feasible solution is a basic solution where all m basic variables are nonnegative
($\ge 0$). A basic feasible solution is said to be degenerate if any of these m variables
equals zero.


Thus it is possible for a variable to be zero and still not be a nonbasic variable for
the current basic feasible solution. (This case corresponds to a corner-point feasible
solution that satisfies another constraint boundary equation in addition to its n defining
equations.) Therefore, it is necessary to keep track of which is the current set of
nonbasic variables (or the current set of basic variables) rather than relying upon their
zero values.

We noted earlier that not every system of n constraint boundary equations yields 
a corner-point solution, either because the system has no solution or because it has 
multiple solutions. For analogous reasons, not every set of n nonbasic variables yields 
a basic solution. However, these cases are avoided by the simplex method. 

To illustrate these definitions, consider the Wyndor Glass Co. example once 
more. Its constraint boundary equations and indicating variables are shown in Table 
5.4. 

Augmenting each of the corner-point feasible solutions (see Table 5.1) yields
the basic feasible solutions listed in Table 5.5. This table places adjacent basic feasible
solutions next to each other, except for the pair consisting of the first and last solutions
listed. Notice that in each case the nonbasic variables necessarily are the indicating
variables for the defining equations. Thus adjacent basic feasible solutions differ by
having just one different nonbasic variable. Also notice that each basic feasible
solution necessarily is the resulting simultaneous solution of the system of equations for
the problem in augmented form (see Table 5.4) when the nonbasic variables are set
equal to zero.

Table 5.5 Basic Feasible Solutions for Wyndor Glass Co. Problem

| Corner-Point Feasible Solution | Defining Equations | Basic Feasible Solution | Nonbasic Variables |
|---|---|---|---|
| (0, 0) | $x_1 = 0$, $x_2 = 0$ | (0, 0, 4, 12, 18) | $x_1$, $x_2$ |
| (0, 6) | $x_1 = 0$, $2x_2 = 12$ | (0, 6, 4, 0, 6) | $x_1$, $x_4$ |
| (2, 6) | $2x_2 = 12$, $3x_1 + 2x_2 = 18$ | (2, 6, 2, 0, 0) | $x_4$, $x_5$ |
| (4, 3) | $3x_1 + 2x_2 = 18$, $x_1 = 4$ | (4, 3, 0, 6, 0) | $x_5$, $x_3$ |
| (4, 0) | $x_1 = 4$, $x_2 = 0$ | (4, 0, 0, 12, 6) | $x_3$, $x_2$ |

Similarly, the other three corner-point solutions (see Table 5.2) yield the
remaining basic solutions shown in Table 5.6.

Table 5.6 Basic Infeasible Solutions for Wyndor Glass Co. Problem

| Corner-Point Infeasible Solution | Defining Equations | Basic Infeasible Solution | Nonbasic Variables |
|---|---|---|---|
| (0, 9) | $x_1 = 0$, $3x_1 + 2x_2 = 18$ | (0, 9, 4, -6, 0) | $x_1$, $x_5$ |
| (4, 6) | $2x_2 = 12$, $x_1 = 4$ | (4, 6, 0, 0, -6) | $x_4$, $x_3$ |
| (6, 0) | $3x_1 + 2x_2 = 18$, $x_2 = 0$ | (6, 0, -2, 12, 0) | $x_5$, $x_2$ |

The other two sets of nonbasic variables, (1) $x_1$ and $x_3$ and (2) $x_2$ and $x_4$, do not
yield a basic solution, because setting either pair of variables equal to zero leads to
having no solution for the system of equations (1)-(3) given in Table 5.4. This
conclusion parallels the observation we made early in this section that the corresponding
sets of constraint boundary equations do not yield a solution.
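
The classification of the ten candidate sets of nonbasic variables can be reproduced by brute force. The following Python sketch is an illustration only: it sets each pair of variables to zero, solves the remaining 3 × 3 system from Table 5.4 with a small exact-arithmetic elimination routine (the helper solve and the list names are ours), and sorts the outcomes into feasible, infeasible, and no basic solution. Variables are indexed from 0, so index 0 is $x_1$.

```python
from fractions import Fraction
from itertools import combinations

# Augmented Wyndor Glass Co. constraints [A, I] x = b (see Table 5.4).
AI = [[1, 0, 1, 0, 0],
      [0, 2, 0, 1, 0],
      [3, 2, 0, 0, 1]]
b = [4, 12, 18]

def solve(M, rhs):
    """Solve M y = rhs by Gauss-Jordan elimination; return None if M is singular."""
    k = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(k):
        pivot = next((r for r in range(col, k) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(k):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

feasible, infeasible, no_basic_solution = [], [], []
for nonbasic in combinations(range(5), 2):
    basic = [j for j in range(5) if j not in nonbasic]
    B = [[row[j] for j in basic] for row in AI]   # columns of the basic variables
    values = solve(B, b)
    if values is None:
        no_basic_solution.append(nonbasic)
        continue
    x = [Fraction(0)] * 5
    for j, v in zip(basic, values):
        x[j] = v
    (feasible if min(x) >= 0 else infeasible).append(tuple(x))

print(len(feasible), len(infeasible), no_basic_solution)
```

The run gives five basic feasible solutions, three basic infeasible solutions, and the two pairs with no basic solution, matching Tables 5.5 and 5.6.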

The simplex method starts at a basic feasible solution and then iteratively moves
to a better adjacent basic feasible solution until an optimal solution is reached. At each
iteration, how is the adjacent basic feasible solution reached?

For the original form of the problem, recall that an adjacent corner-point solution 
is reached from the current one by (1) deleting one constraint boundary (defining 
equation) from the set of n constraint boundaries defining the current solution, 
(2) moving away from the current solution in the feasible direction along the inter- 
section of the remaining (n — 1) constraint boundaries (an edge of the feasible region), 
and (3) stopping when the first new constraint boundary (defining equation) is reached. 

Equivalently, in our new terminology, the simplex method reaches an adjacent 
basic feasible solution from the current one by (1) deleting one variable (the entering 
basic variable) from the set of n nonbasic variables defining the current solution, 
(2) moving away from the current solution by increasing this one variable from zero 
(and adjusting the other basic variables to still satisfy the system of equations) while 
keeping the remaining (n — 1) nonbasic variables at zero, and (3) stopping when the 
first of the basic variables (the leaving basic variable) reaches a value of zero (its 
constraint boundary). With either interpretation, the choice among the n alternatives 
in step (1) is made by selecting the one that would give the best rate of improvement 
in Z (per unit increase in the entering basic variable) during step (2). 


5.2 The Revised Simplex Method 
The simplex method as described in Chap. 4 (hereafter called the original simplex
method) is a straightforward algebraic procedure. However, this way of executing the
algorithm (in either algebraic or tabular form) is not the most efficient computational
procedure for computers because it computes and stores many numbers that are not
needed at the current iteration and that may not even become relevant for decision
making at subsequent iterations. The only pieces of information relevant at each
iteration are the coefficients of the nonbasic variables in Eq. (0), the coefficients of
the entering basic variable in the other equations, and the right-hand side of the
equations. It would be very useful to have a procedure that could obtain this
information efficiently without computing and storing all the other coefficients.

As mentioned in Sec. 4.8, these considerations motivated the development of 
the revised simplex method. This method was designed to accomplish exactly the same 
things as the original simplex method, but in a way that is more efficient for execution 
on a computer. Thus it is a streamlined version of the original procedure. It computes 
and stores only the information that is currently needed, and it carries along the 
essential data in a more compact form. 

The revised simplex method explicitly uses matrix manipulations, so it is
necessary to describe the problem in matrix notation. (See Appendix 3 for a review of
matrices.) To help you distinguish between matrices, vectors, and scalars, we
consistently use boldface CAPITAL letters to represent matrices, boldface lowercase
letters to represent vectors, and italicized letters in ordinary print to represent scalars.
We also use a boldface zero (0) to denote a null vector (a vector whose elements all
are zero) in either column or row form (which one should be clear from the context),
whereas a zero in ordinary print (0) continues to represent the number zero.

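Using matrices, our standard form for the general linear programming model
given in Sec. 3.2 becomes

$$\text{Maximize } Z = \mathbf{c}\mathbf{x},$$

subject to

$$\mathbf{A}\mathbf{x} \le \mathbf{b} \quad \text{and} \quad \mathbf{x} \ge \mathbf{0},$$

where $\mathbf{c}$ is the row vector

$$\mathbf{c} = [c_1, c_2, \ldots, c_n],$$

$\mathbf{x}$, $\mathbf{b}$, and $\mathbf{0}$ are the column vectors such that

$$
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad
\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}, \qquad
\mathbf{0} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix},
$$

and $\mathbf{A}$ is the matrix

$$
\mathbf{A} = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}.
$$

To obtain the augmented form of the problem, introduce the column vector of slack
variables

$$
\mathbf{x}_s = \begin{bmatrix} x_{n+1} \\ x_{n+2} \\ \vdots \\ x_{n+m} \end{bmatrix}
$$

so that the constraints become

$$
[\mathbf{A}, \mathbf{I}] \begin{bmatrix} \mathbf{x} \\ \mathbf{x}_s \end{bmatrix} = \mathbf{b}
\quad \text{and} \quad
\begin{bmatrix} \mathbf{x} \\ \mathbf{x}_s \end{bmatrix} \ge \mathbf{0},
$$

where $\mathbf{I}$ is the m × m identity matrix, and the null vector $\mathbf{0}$ now has (n + m) elements.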


Solving for a Basic Feasible Solution 


Recall that the general approach of the simplex method is to obtain a sequence of 
improving basic feasible solutions until an optimal solution is reached. One of the 
key features of the revised simplex method involves the way in which it solves for 
each new basic feasible solution after identifying its basic and nonbasic variables. 
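Given these variables, the resulting basic solution is the solution of the m equations

$$[\mathbf{A}, \mathbf{I}] \begin{bmatrix} \mathbf{x} \\ \mathbf{x}_s \end{bmatrix} = \mathbf{b},$$

in which the n nonbasic variables from among the (n + m) elements of

$$\begin{bmatrix} \mathbf{x} \\ \mathbf{x}_s \end{bmatrix}$$

are set equal to zero. Eliminating these n variables by equating them to zero leaves a
set of m equations in m unknowns (the basic variables). This set of equations can be
denoted by

$$\mathbf{B}\mathbf{x}_B = \mathbf{b},$$

where the vector of basic variables,

$$\mathbf{x}_B = \begin{bmatrix} x_{B1} \\ x_{B2} \\ \vdots \\ x_{Bm} \end{bmatrix},$$

is obtained by eliminating the nonbasic variables from the vector above, and the basis matrix,

$$
\mathbf{B} = \begin{bmatrix}
B_{11} & B_{12} & \cdots & B_{1m} \\
B_{21} & B_{22} & \cdots & B_{2m} \\
\vdots & \vdots & & \vdots \\
B_{m1} & B_{m2} & \cdots & B_{mm}
\end{bmatrix},
$$

is obtained by eliminating the columns corresponding to coefficients of nonbasic
variables from $[\mathbf{A}, \mathbf{I}]$. (In addition, the elements of $\mathbf{x}_B$ and, therefore, the columns of $\mathbf{B}$
may be placed in a different order when executing the simplex method.)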

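The simplex method introduces only basic variables such that $\mathbf{B}$ is nonsingular,
so that $\mathbf{B}^{-1}$ always will exist. Therefore, to solve $\mathbf{B}\mathbf{x}_B = \mathbf{b}$, both sides would be
premultiplied by $\mathbf{B}^{-1}$:

$$\mathbf{B}^{-1}\mathbf{B}\mathbf{x}_B = \mathbf{B}^{-1}\mathbf{b}.$$

Since $\mathbf{B}^{-1}\mathbf{B} = \mathbf{I}$, the desired solution for the basic variables is

$$\mathbf{x}_B = \mathbf{B}^{-1}\mathbf{b}.$$

Let $\mathbf{c}_B$ be the vector whose elements are the objective function coefficients (including
zeroes for slack variables) for the corresponding elements of $\mathbf{x}_B$. The value of the
objective function for this basic solution is then

$$Z = \mathbf{c}_B\mathbf{x}_B = \mathbf{c}_B\mathbf{B}^{-1}\mathbf{b}.$$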


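EXAMPLE: To illustrate this method of solving for a basic feasible solution, consider
again the Wyndor Glass Co. problem presented in Sec. 3.1 and solved by the original
simplex method in Table 4.8. In this case,

$$
\mathbf{c} = [3, 5], \quad
[\mathbf{A}, \mathbf{I}] = \begin{bmatrix} 1 & 0 & 1 & 0 & 0 \\ 0 & 2 & 0 & 1 & 0 \\ 3 & 2 & 0 & 0 & 1 \end{bmatrix}, \quad
\mathbf{b} = \begin{bmatrix} 4 \\ 12 \\ 18 \end{bmatrix}, \quad
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad
\mathbf{x}_s = \begin{bmatrix} x_3 \\ x_4 \\ x_5 \end{bmatrix}.
$$

Referring to Table 4.8, the sequence of basic feasible solutions obtained by the simplex
method (original or revised) is the following:

Iteration 0

$$
\mathbf{x}_B = \begin{bmatrix} x_3 \\ x_4 \\ x_5 \end{bmatrix}, \quad
\mathbf{B} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \mathbf{B}^{-1}, \quad \text{so} \quad
\begin{bmatrix} x_3 \\ x_4 \\ x_5 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 4 \\ 12 \\ 18 \end{bmatrix} =
\begin{bmatrix} 4 \\ 12 \\ 18 \end{bmatrix},
$$

$$
\mathbf{c}_B = [0, 0, 0], \quad \text{so} \quad Z = [0, 0, 0] \begin{bmatrix} 4 \\ 12 \\ 18 \end{bmatrix} = 0.
$$

Iteration 1

$$
\mathbf{x}_B = \begin{bmatrix} x_3 \\ x_2 \\ x_5 \end{bmatrix}, \quad
\mathbf{B} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 2 & 1 \end{bmatrix}, \quad
\mathbf{B}^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -1 & 1 \end{bmatrix},
$$

$$
\text{so} \quad
\begin{bmatrix} x_3 \\ x_2 \\ x_5 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 4 \\ 12 \\ 18 \end{bmatrix} =
\begin{bmatrix} 4 \\ 6 \\ 6 \end{bmatrix},
$$

$$
\mathbf{c}_B = [0, 5, 0], \quad \text{so} \quad Z = [0, 5, 0] \begin{bmatrix} 4 \\ 6 \\ 6 \end{bmatrix} = 30.
$$

Iteration 2

$$
\mathbf{x}_B = \begin{bmatrix} x_3 \\ x_2 \\ x_1 \end{bmatrix}, \quad
\mathbf{B} = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 2 & 0 \\ 0 & 2 & 3 \end{bmatrix}, \quad
\mathbf{B}^{-1} = \begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix},
$$

$$
\text{so} \quad
\begin{bmatrix} x_3 \\ x_2 \\ x_1 \end{bmatrix} =
\begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix}
\begin{bmatrix} 4 \\ 12 \\ 18 \end{bmatrix} =
\begin{bmatrix} 2 \\ 6 \\ 2 \end{bmatrix},
$$

$$
\mathbf{c}_B = [0, 5, 3], \quad \text{so} \quad Z = [0, 5, 3] \begin{bmatrix} 2 \\ 6 \\ 2 \end{bmatrix} = 36.
$$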
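
The three iterations of this example can be checked with exact arithmetic. The sketch below is an illustration, not part of the original procedure: it takes the basis matrices and inverses for the Wyndor Glass Co. problem as typed-in data, confirms that each stated inverse really satisfies $\mathbf{B}^{-1}\mathbf{B} = \mathbf{I}$, and then evaluates $\mathbf{x}_B = \mathbf{B}^{-1}\mathbf{b}$ and $Z = \mathbf{c}_B\mathbf{x}_B$. The helper names matvec and matmul are ours.

```python
from fractions import Fraction as F

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def matmul(P, Q):
    """Matrix product P Q for lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

b = [4, 12, 18]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# (basic variables, B, B^{-1}, c_B) for iterations 0, 1, 2 of the example.
iterations = [
    (("x3", "x4", "x5"), I3, I3, [0, 0, 0]),
    (("x3", "x2", "x5"),
     [[1, 0, 0], [0, 2, 0], [0, 2, 1]],
     [[1, 0, 0], [0, F(1, 2), 0], [0, -1, 1]],
     [0, 5, 0]),
    (("x3", "x2", "x1"),
     [[1, 0, 1], [0, 2, 0], [0, 2, 3]],
     [[1, F(1, 3), F(-1, 3)], [0, F(1, 2), 0], [0, F(-1, 3), F(1, 3)]],
     [0, 5, 3]),
]

results = []
for names, B, B_inv, c_B in iterations:
    assert matmul(B_inv, B) == I3            # the stated inverse is correct
    x_B = matvec(B_inv, b)                   # x_B = B^{-1} b
    Z = sum(c * x for c, x in zip(c_B, x_B)) # Z = c_B x_B
    results.append((x_B, Z))
    print(names, [str(v) for v in x_B], Z)
```

Using Fraction rather than floating point keeps entries such as 1/3 exact, so the comparison with the identity matrix is reliable.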


Matrix Form of the Current Set of Equations 


The last preliminary before summarizing the revised simplex method is to show the 
matrix form of the set of equations appearing in the simplex tableau for any iteration 
of the original simplex method. 

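For the original set of equations, the matrix form is

$$
\begin{bmatrix} 1 & -\mathbf{c} & \mathbf{0} \\ \mathbf{0} & \mathbf{A} & \mathbf{I} \end{bmatrix}
\begin{bmatrix} Z \\ \mathbf{x} \\ \mathbf{x}_s \end{bmatrix} =
\begin{bmatrix} 0 \\ \mathbf{b} \end{bmatrix}.
$$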











This set of equations also is exhibited in the first simplex tableau of Table 5.7. 

The algebraic operations performed by the simplex method (multiply an equation
by a constant and add a multiple of one equation to another equation) are expressed
in matrix form by premultiplying both sides of the original set of equations by the
appropriate matrix. This matrix would have the same elements as the identity matrix,
except that each multiple for an algebraic operation would go into the spot needed to
have the matrix multiplication perform this operation. Even after a series of algebraic
operations over several iterations, we still can deduce what this matrix must be
(symbolically) for the entire series by using what we already know about the right-hand
side of the new set of equations. In particular, after any iteration, $\mathbf{x}_B = \mathbf{B}^{-1}\mathbf{b}$ and
$Z = \mathbf{c}_B\mathbf{B}^{-1}\mathbf{b}$, so the right-hand side of the new set of equations has become

$$
\begin{bmatrix} Z \\ \mathbf{x}_B \end{bmatrix} =
\begin{bmatrix} \mathbf{c}_B\mathbf{B}^{-1}\mathbf{b} \\ \mathbf{B}^{-1}\mathbf{b} \end{bmatrix} =
\begin{bmatrix} 1 & \mathbf{c}_B\mathbf{B}^{-1} \\ \mathbf{0} & \mathbf{B}^{-1} \end{bmatrix}
\begin{bmatrix} 0 \\ \mathbf{b} \end{bmatrix}.
$$


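Table 5.7 Initial and Later Simplex Tableaux in Matrix Form

| Iteration | Basic Variable | Eq. No. | Coefficient of Z | Coefficient of Original Variables | Coefficient of Slack Variables | Right Side |
|---|---|---|---|---|---|---|
| 0 | $Z$ | 0 | 1 | $-\mathbf{c}$ | $\mathbf{0}$ | 0 |
| 0 | $\mathbf{x}_B$ | 1, ..., m | $\mathbf{0}$ | $\mathbf{A}$ | $\mathbf{I}$ | $\mathbf{b}$ |
| Any | $Z$ | 0 | 1 | $\mathbf{c}_B\mathbf{B}^{-1}\mathbf{A} - \mathbf{c}$ | $\mathbf{c}_B\mathbf{B}^{-1}$ | $\mathbf{c}_B\mathbf{B}^{-1}\mathbf{b}$ |
| Any | $\mathbf{x}_B$ | 1, ..., m | $\mathbf{0}$ | $\mathbf{B}^{-1}\mathbf{A}$ | $\mathbf{B}^{-1}$ | $\mathbf{B}^{-1}\mathbf{b}$ |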
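Because we perform the same series of algebraic operations on both sides of
the original set of equations, we use the same matrix that premultiplies the original
right-hand side to premultiply the original left-hand side. Consequently, since

$$
\begin{bmatrix} 1 & \mathbf{c}_B\mathbf{B}^{-1} \\ \mathbf{0} & \mathbf{B}^{-1} \end{bmatrix}
\begin{bmatrix} 1 & -\mathbf{c} & \mathbf{0} \\ \mathbf{0} & \mathbf{A} & \mathbf{I} \end{bmatrix} =
\begin{bmatrix} 1 & \mathbf{c}_B\mathbf{B}^{-1}\mathbf{A} - \mathbf{c} & \mathbf{c}_B\mathbf{B}^{-1} \\ \mathbf{0} & \mathbf{B}^{-1}\mathbf{A} & \mathbf{B}^{-1} \end{bmatrix},
$$

the desired matrix form of the set of equations after any iteration is

$$
\begin{bmatrix} 1 & \mathbf{c}_B\mathbf{B}^{-1}\mathbf{A} - \mathbf{c} & \mathbf{c}_B\mathbf{B}^{-1} \\ \mathbf{0} & \mathbf{B}^{-1}\mathbf{A} & \mathbf{B}^{-1} \end{bmatrix}
\begin{bmatrix} Z \\ \mathbf{x} \\ \mathbf{x}_s \end{bmatrix} =
\begin{bmatrix} \mathbf{c}_B\mathbf{B}^{-1}\mathbf{b} \\ \mathbf{B}^{-1}\mathbf{b} \end{bmatrix}.
$$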














The second simplex tableau of Table 5.7 also exhibits this same set of equations. 


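EXAMPLE: To illustrate this matrix form for the current set of equations, consider
the final set of equations resulting from iteration 2 for the Wyndor Glass Co. problem.
Using the $\mathbf{B}^{-1}$ given for iteration 2,

$$
\mathbf{B}^{-1}\mathbf{A} =
\begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 3 & 2 \end{bmatrix} =
\begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \end{bmatrix},
$$

$$
\mathbf{c}_B\mathbf{B}^{-1} = [0, 5, 3]
\begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix} =
[0, \tfrac{3}{2}, 1],
$$

$$
\mathbf{c}_B\mathbf{B}^{-1}\mathbf{A} - \mathbf{c} = [0, 5, 3]
\begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} - [3, 5] = [0, 0].
$$

Also, using the values of $\mathbf{x}_B = \mathbf{B}^{-1}\mathbf{b}$ and $Z = \mathbf{c}_B\mathbf{B}^{-1}\mathbf{b}$ calculated a few pages back,
these results give the following set of equations:

$$
\begin{bmatrix}
1 & 0 & 0 & 0 & \tfrac{3}{2} & 1 \\
0 & 0 & 0 & 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\
0 & 0 & 1 & 0 & \tfrac{1}{2} & 0 \\
0 & 1 & 0 & 0 & -\tfrac{1}{3} & \tfrac{1}{3}
\end{bmatrix}
\begin{bmatrix} Z \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} =
\begin{bmatrix} 36 \\ 2 \\ 6 \\ 2 \end{bmatrix},
$$

as shown in the final simplex tableau in Table 4.8.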
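
To make the matrix form concrete, the following Python sketch assembles the final tableau of this example from $\mathbf{B}^{-1}$ and the original parameters alone. It is a minimal illustration under the assumption that the data are exactly those of the Wyndor Glass Co. example; the helper matmul and the variable names are ours.

```python
from fractions import Fraction as F

def matmul(P, Q):
    """Matrix product P Q for lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

A = [[1, 0], [0, 2], [3, 2]]
b = [[4], [12], [18]]                      # column vector
c = [3, 5]
c_B = [[0, 5, 3]]                          # row vector for basic variables x3, x2, x1
B_inv = [[1, F(1, 3), F(-1, 3)],
         [0, F(1, 2), 0],
         [0, F(-1, 3), F(1, 3)]]

y = matmul(c_B, B_inv)[0]                                   # c_B B^{-1}
row0_x = [v - cj for v, cj in zip(matmul([y], A)[0], c)]    # c_B B^{-1} A - c
Z = matmul([y], b)[0][0]                                    # c_B B^{-1} b
BA = matmul(B_inv, A)                                       # B^{-1} A
x_B = matmul(B_inv, b)                                      # B^{-1} b

# Columns: Z, x1, x2, x3, x4, x5, right side.
tableau = [[1] + row0_x + y + [Z]]
tableau += [[0] + BA[i] + list(B_inv[i]) + x_B[i] for i in range(3)]

for row in tableau:
    print([str(v) for v in row])
```

Every entry is produced from A, b, c, and the current $\mathbf{B}^{-1}$, which is the point of the second tableau in Table 5.7.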


The Overall Procedure 


There are two key implications from the matrix form of the current set of equations
shown at the bottom of Table 5.7. The first is that only $\mathbf{B}^{-1}$ needs to be derived to
be able to calculate all the numbers in the simplex tableau from the original parameters
($\mathbf{A}$, $\mathbf{b}$, $\mathbf{c}_B$) of the problem. (This implication is the essence of the fundamental insight
described in the next section.) The second is that any one of these numbers (except
$Z = \mathbf{c}_B\mathbf{B}^{-1}\mathbf{b}$) can be obtained by performing only part of a matrix multiplication.
Therefore, the required numbers to perform an iteration of the simplex method can
be obtained as needed without expending the computational effort to obtain all the
numbers. These two key implications are incorporated into the following summary of
the overall procedure.


Summary of Revised Simplex Method 


1. Initialization step: Same as for original simplex method.
2. Iterative step:

Part 1 Determine the entering basic variable: Same as for original
simplex method.

Part 2 Determine the leaving basic variable: Same as for original
simplex method, except calculate only the numbers required to do this [the
coefficients of the entering basic variable in every equation but Eq. (0), and
then, for each strictly positive coefficient, the right-hand side of that
equation¹].

Part 3 Determine the new basic feasible solution: Derive $\mathbf{B}^{-1}$ and set
$\mathbf{x}_B = \mathbf{B}^{-1}\mathbf{b}$. (Calculating $\mathbf{x}_B$ is optional unless the optimality test finds it to
be optimal.)

3. Optimality test: Same as for original simplex method, except calculate only
the numbers required to do this test, i.e., the coefficients of the nonbasic
variables in Eq. (0).


In part 3 of the iterative step, $\mathbf{B}^{-1}$ could be derived each time by using a standard
computer routine for inverting a matrix. However, since $\mathbf{B}$ (and therefore $\mathbf{B}^{-1}$)
changes so little from one iteration to the next, it is much more efficient to derive the
new $\mathbf{B}^{-1}$ (denote it by $\mathbf{B}^{-1}_{\text{new}}$) from the $\mathbf{B}^{-1}$ at the preceding iteration (denote it by
$\mathbf{B}^{-1}_{\text{old}}$). (For the initial basic feasible solution, $\mathbf{B} = \mathbf{I} = \mathbf{B}^{-1}$.) The method for doing
this derivation is based directly upon the interpretation of the elements of $\mathbf{B}^{-1}$ (the
coefficients of the slack variables in the current equations 1, 2, ..., m) presented
in the next section, as well as upon the procedure used by the original simplex method
to obtain the new set of equations from the preceding set.

To describe this method formally, let

$x_k$ = entering basic variable,

$a'_{ik}$ = coefficient of $x_k$ in current Eq. (i), for i = 1, 2, ..., m
(calculated in part 2 of the iterative step),

r = number of the equation containing the leaving basic variable.

Recall that the new set of equations [excluding Eq. (0)] can be obtained from the
preceding set by subtracting $a'_{ik}/a'_{rk}$ times Eq. (r) from Eq. (i), for all i =
1, 2, ..., m except i = r, and then dividing Eq. (r) by $a'_{rk}$. Therefore, the element
in row i and column j of $\mathbf{B}^{-1}_{\text{new}}$ is

$$
(\mathbf{B}^{-1}_{\text{new}})_{ij} =
\begin{cases}
(\mathbf{B}^{-1}_{\text{old}})_{ij} - \dfrac{a'_{ik}}{a'_{rk}} (\mathbf{B}^{-1}_{\text{old}})_{rj} & \text{if } i \ne r, \\[2ex]
\dfrac{1}{a'_{rk}} (\mathbf{B}^{-1}_{\text{old}})_{rj} & \text{if } i = r.
\end{cases}
$$

These formulas are expressed in matrix notation as

$$\mathbf{B}^{-1}_{\text{new}} = \mathbf{E}\mathbf{B}^{-1}_{\text{old}},$$

where the matrix $\mathbf{E}$ is an identity matrix except that its rth column is replaced by the
vector

$$
\boldsymbol{\eta} = \begin{bmatrix} \eta_1 \\ \eta_2 \\ \vdots \\ \eta_m \end{bmatrix},
\quad \text{where} \quad
\eta_i =
\begin{cases}
-\dfrac{a'_{ik}}{a'_{rk}} & \text{if } i \ne r, \\[2ex]
\dfrac{1}{a'_{rk}} & \text{if } i = r.
\end{cases}
$$

Thus $\mathbf{E} = [\mathbf{U}_1, \mathbf{U}_2, \ldots, \mathbf{U}_{r-1}, \boldsymbol{\eta}, \mathbf{U}_{r+1}, \ldots, \mathbf{U}_m]$, where the m elements of each
of the $\mathbf{U}_i$ column vectors are 0 except for a 1 in the ith position.

¹ Because the value of $\mathbf{x}_B$ is the entire vector of right-hand sides except for Eq. (0), the relevant right-hand
sides need not be calculated here if $\mathbf{x}_B$ was calculated in part 3 of the preceding iteration.
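
The update $\mathbf{B}^{-1}_{\text{new}} = \mathbf{E}\mathbf{B}^{-1}_{\text{old}}$ can be sketched in a few lines of Python. This is an illustration under assumptions, not a production routine: the function names eta_matrix and matmul are ours, equation numbers are 0-based in the code (so r = 1 means Eq. (2)), and the two pivot columns are the ones that arise for the Wyndor Glass Co. problem.

```python
from fractions import Fraction as F

def matmul(P, Q):
    """Matrix product P Q for lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def eta_matrix(column, r):
    """Identity matrix whose column r is replaced by the eta vector.

    column: current coefficients a'_ik of the entering basic variable.
    r: 0-based index of the equation containing the leaving basic variable.
    """
    m = len(column)
    pivot = F(column[r])
    eta = [1 / pivot if i == r else -F(column[i]) / pivot for i in range(m)]
    E = [[F(int(i == j)) for j in range(m)] for i in range(m)]
    for i in range(m):
        E[i][r] = eta[i]
    return E

identity = [[F(1), F(0), F(0)], [F(0), F(1), F(0)], [F(0), F(0), F(1)]]

# x2 enters with current coefficients (0, 2, 2); the leaving variable is in Eq. (2).
B_inv_1 = matmul(eta_matrix([0, 2, 2], 1), identity)
# x1 enters with current coefficients (1, 0, 3); the leaving variable is in Eq. (3).
B_inv_2 = matmul(eta_matrix([1, 0, 3], 2), B_inv_1)

for row in B_inv_2:
    print([str(v) for v in row])
```

Forming E explicitly is done here only for clarity; because E differs from the identity in a single column, an implementation would normally apply the elementwise formulas directly.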


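EXAMPLE: We shall illustrate the revised simplex method by applying it to the
Wyndor Glass Co. problem. The initial basic variables are the slack variables

$$\mathbf{x}_B = \begin{bmatrix} x_3 \\ x_4 \\ x_5 \end{bmatrix}.$$

Iteration 1: Because the initial $\mathbf{B}^{-1} = \mathbf{I}$, no calculations are needed to obtain the
numbers required to identify the entering basic variable $x_2$ ($-c_2 = -5 < -3 = -c_1$)
and the leaving basic variable $x_4$ ($a_{12} = 0$, $b_2/a_{22} = 12/2 < 18/2 = b_3/a_{32}$, so
r = 2). Thus the new set of basic variables is

$$\mathbf{x}_B = \begin{bmatrix} x_3 \\ x_2 \\ x_5 \end{bmatrix}.$$

To obtain the new $\mathbf{B}^{-1}$,

$$
\boldsymbol{\eta} = \begin{bmatrix} -\dfrac{a_{12}}{a_{22}} \\[1.5ex] \dfrac{1}{a_{22}} \\[1.5ex] -\dfrac{a_{32}}{a_{22}} \end{bmatrix}
= \begin{bmatrix} 0 \\ \tfrac{1}{2} \\ -1 \end{bmatrix},
$$

$$
\text{so} \quad \mathbf{B}^{-1} =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -1 & 1 \end{bmatrix},
$$

$$
\text{so that} \quad
\begin{bmatrix} x_3 \\ x_2 \\ x_5 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 4 \\ 12 \\ 18 \end{bmatrix} =
\begin{bmatrix} 4 \\ 6 \\ 6 \end{bmatrix}.
$$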


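To test whether this solution is optimal, we calculate the coefficients of the
nonbasic variables ($x_1$ and $x_4$) in Eq. (0). Performing only the relevant parts of the
matrix multiplications (entries that need not be calculated are shown as $\cdot$),

$$
\mathbf{c}_B\mathbf{B}^{-1}\mathbf{A} - \mathbf{c} = [0, 5, 0]
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & \cdot \\ 0 & \cdot \\ 3 & \cdot \end{bmatrix} - [3, \cdot] = [-3, \cdot],
$$

$$
\mathbf{c}_B\mathbf{B}^{-1} = [0, 5, 0]
\begin{bmatrix} \cdot & 0 & \cdot \\ \cdot & \tfrac{1}{2} & \cdot \\ \cdot & -1 & \cdot \end{bmatrix} =
[\cdot, \tfrac{5}{2}, \cdot],
$$

so the coefficients of $x_1$ and $x_4$ are $-3$ and $\tfrac{5}{2}$, respectively. Since $x_1$ has a negative
coefficient, this solution is not optimal.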


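Iteration 2: Using these coefficients of the nonbasic variables, the next iteration begins
by identifying $x_1$ as the entering basic variable. To determine the leaving basic
variable, we must calculate the other coefficients of $x_1$ (entries that need not be
calculated are shown as $\cdot$):

$$
\mathbf{B}^{-1}\mathbf{A} =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & \cdot \\ 0 & \cdot \\ 3 & \cdot \end{bmatrix} =
\begin{bmatrix} 1 & \cdot \\ 0 & \cdot \\ 3 & \cdot \end{bmatrix}.
$$

Using the right-side column for the current basic feasible solution (the value of $\mathbf{x}_B$)
just given for iteration 1, the ratios 4/1 > 6/3 indicate that $x_5$ is the leaving basic
variable, so the new set of basic variables is

$$
\mathbf{x}_B = \begin{bmatrix} x_3 \\ x_2 \\ x_1 \end{bmatrix},
\quad \text{with} \quad
\boldsymbol{\eta} = \begin{bmatrix} -\dfrac{a'_{11}}{a'_{31}} \\[1.5ex] -\dfrac{a'_{21}}{a'_{31}} \\[1.5ex] \dfrac{1}{a'_{31}} \end{bmatrix}
= \begin{bmatrix} -\tfrac{1}{3} \\ 0 \\ \tfrac{1}{3} \end{bmatrix}.
$$

Therefore, the new $\mathbf{B}^{-1}$ is

$$
\mathbf{B}^{-1} =
\begin{bmatrix} 1 & 0 & -\tfrac{1}{3} \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac{1}{3} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -1 & 1 \end{bmatrix} =
\begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix},
$$

$$
\text{so that} \quad
\begin{bmatrix} x_3 \\ x_2 \\ x_1 \end{bmatrix} =
\begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix}
\begin{bmatrix} 4 \\ 12 \\ 18 \end{bmatrix} =
\begin{bmatrix} 2 \\ 6 \\ 2 \end{bmatrix}.
$$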

Applying the optimality test, we find that the coefficients of the nonbasic variables
($x_4$ and $x_5$) in Eq. (0) are

$$
\mathbf{c}_B\mathbf{B}^{-1} = [0, 5, 3]
\begin{bmatrix} \cdot & \tfrac{1}{3} & -\tfrac{1}{3} \\ \cdot & \tfrac{1}{2} & 0 \\ \cdot & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix} =
[\cdot, \tfrac{3}{2}, 1],
$$

where $\cdot$ marks an entry that need not be calculated. Because both coefficients ($\tfrac{3}{2}$ and 1)
are nonnegative, the current solution ($x_1 = 2$, $x_2 = 6$, $x_3 = 2$, $x_4 = 0$, $x_5 = 0$) is
optimal and the procedure terminates.
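
The whole example can be replayed by a compact program. The sketch below is one possible rendering of the summary of the revised simplex method, written under several assumptions that are ours rather than the text's: the problem is in standard form with nonnegative right-hand sides (so the slack variables give the initial basis), the most negative Eq. (0) coefficient selects the entering variable, ties are broken arbitrarily, and no safeguard against cycling is included. The function name revised_simplex is hypothetical.

```python
from fractions import Fraction as F

def revised_simplex(A, b, c):
    """Maximize c x subject to A x <= b, x >= 0, assuming b >= 0.

    Returns (x, Z), where x holds the original and slack variables.
    """
    m, n = len(A), len(c)
    # Columns of [A, I] and objective coefficients of all n + m variables.
    cols = [[F(A[i][j]) for i in range(m)] for j in range(n)]
    cols += [[F(int(i == j)) for i in range(m)] for j in range(m)]
    cost = [F(v) for v in c] + [F(0)] * m
    basis = list(range(n, n + m))                     # slack variables
    B_inv = [[F(int(i == j)) for j in range(m)] for i in range(m)]
    x_B = [F(v) for v in b]
    while True:
        # Optimality test: Eq. (0) coefficients of the nonbasic variables.
        y = [sum(cost[basis[i]] * B_inv[i][j] for i in range(m)) for j in range(m)]
        reduced = {j: sum(y[i] * cols[j][i] for i in range(m)) - cost[j]
                   for j in range(n + m) if j not in basis}
        k = min(reduced, key=reduced.get)             # entering basic variable
        if reduced[k] >= 0:
            break
        # Part 2: current coefficients of x_k, then the minimum ratio test.
        a_k = [sum(B_inv[i][j] * cols[k][j] for j in range(m)) for i in range(m)]
        rows = [i for i in range(m) if a_k[i] > 0]
        if not rows:
            raise ValueError("Z is unbounded")
        r = min(rows, key=lambda i: x_B[i] / a_k[i])
        # Part 3: update B^{-1} with the elementwise formulas, then x_B = B^{-1} b.
        pivot_row = [v / a_k[r] for v in B_inv[r]]
        B_inv = [pivot_row if i == r else
                 [B_inv[i][j] - a_k[i] * pivot_row[j] for j in range(m)]
                 for i in range(m)]
        basis[r] = k
        x_B = [sum(B_inv[i][j] * b[j] for j in range(m)) for i in range(m)]
    x = [F(0)] * (n + m)
    for i, j in enumerate(basis):
        x[j] = x_B[i]
    Z = sum(cost[j] * x[j] for j in range(n + m))
    return x, Z

x, Z = revised_simplex([[1, 0], [0, 2], [3, 2]], [4, 12, 18], [3, 5])
print([str(v) for v in x], Z)   # expected: x1 = 2, x2 = 6, x3 = 2, x4 = x5 = 0, Z = 36
```

On the Wyndor data the loop makes the same two pivots as the example ($x_2$ replaces $x_4$, then $x_1$ replaces $x_5$) before the optimality test succeeds.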





General Observations 


Although the preceding pages describe the essence of the revised simplex method, we
should point out that minor modifications may be made to improve the efficiency of
its execution on computers. For example, $\mathbf{B}^{-1}$ may be obtained as the product of the
previous $\mathbf{E}$ matrices. This modification requires storing only the $\boldsymbol{\eta}$ column of $\mathbf{E}$ and
the number of the column, rather than the $\mathbf{B}^{-1}$ matrix, at each iteration. If magnetic
tape must be used rather than core storage, this ‘‘product form’’ of the basis inverse
may be the most efficient.
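
A minimal sketch of this product form follows, assuming the two $\boldsymbol{\eta}$ columns and column numbers that arise for the Wyndor Glass Co. problem. Instead of storing $\mathbf{B}^{-1}$, the code keeps a list of (eta, r) pairs and applies them to $\mathbf{b}$ in the order they were generated; the names apply_eta and eta_file are ours, and r is 0-based.

```python
from fractions import Fraction as F

def apply_eta(eta, r, v):
    """Return E v, where E is the identity with column r replaced by eta."""
    return [eta[i] * v[r] if i == r else v[i] + eta[i] * v[r]
            for i in range(len(v))]

# One (eta column, column number) pair per iteration of the example.
eta_file = [
    ([F(0), F(1, 2), F(-1)], 1),      # iteration 1: x2 enters, pivot in Eq. (2)
    ([F(-1, 3), F(0), F(1, 3)], 2),   # iteration 2: x1 enters, pivot in Eq. (3)
]

x_B = [F(4), F(12), F(18)]            # right-hand side b
for eta, r in eta_file:
    x_B = apply_eta(eta, r, x_B)      # B^{-1} b = E_2 (E_1 b)

print([str(v) for v in x_B])          # values of x3, x2, x1
```

Each application costs on the order of m operations, and only m + 1 numbers are stored per iteration, which is the storage saving mentioned above.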

You should also note that the preceding discussion was limited to the case of
linear programming problems fitting our standard form given in Sec. 3.2. However,
the modifications for other forms are relatively straightforward. The initialization step
would be conducted just as it would for the original simplex method (see Sec. 4.6).
When this step involves introducing artificial variables to obtain an initial basic feasible
solution (and thereby to obtain an identity matrix as the initial basis matrix), these
variables would be included among the m elements of $\mathbf{x}_s$.

Let us summarize the advantages of the revised simplex method over the original 
simplex method. One advantage is that the number of arithmetic computations may 
be reduced. This is especially true when the A matrix contains a large number of zero 
elements (which is usually the case for the large problems arising in practice). The 
amount of information that must be stored at each iteration is less, sometimes
considerably so. The revised simplex method also permits the control of the round-off
errors inevitably generated by computers. This control can be exercised by periodically
obtaining the current $\mathbf{B}^{-1}$ by directly inverting $\mathbf{B}$. Furthermore, some of the
post-optimality problems discussed in Sec. 4.7 can be handled more conveniently with the
revised simplex method. For all these reasons, the revised simplex method is usually 
preferable to the original simplex method for computer execution. 


5.3 A Fundamental Insight 


We shall now focus on a property of the simplex method (in any form) that has been 
revealed by the revised simplex method in the preceding section. This fundamental 
insight provides the key to both duality theory and sensitivity analysis (Chap. 6), two 
very important parts of linear programming. 

The insight involves the coefficients of the slack variables and the information
they give. It is a direct result of the initialization step, where the $i$th slack variable
($x_{n+i}$) is given a coefficient of +1 in Eq. ($i$) and a coefficient of zero in every other
equation [including Eq. (0)] for $i = 1, 2, \ldots, m$, as shown by the null vector $\mathbf{0}$
and the identity matrix $\mathbf{I}$ in the Slack Variables column for iteration 0 in Table 5.7.[1]
The other key factor is that subsequent iterations change the initial equations only by:


1. Multiplying (or dividing) an entire equation by a nonzero constant; or
2. Adding (or subtracting) a multiple of one entire equation to another entire
equation.

[1] Throughout most of this section, we assume that the problem is in our standard form, with $b_i \ge 0$ for
all $i = 1, 2, \ldots, m$, so that no additional adjustments are needed in the initialization step. We then adapt
our conclusions to nonstandard forms late in the section.


As already described in the preceding section, a sequence of these kinds of 
algebraic operations is equivalent to premultiplying the initial simplex tableau by some 
matrix. (See Appendix 3 for a review of matrices.) The consequence can be sum- 
marized as follows. 


Verbal Description of Fundamental Insight: After any iteration, the coefficients of the 
slack variables in each equation immediately reveal how that equation has been 
obtained from the initial equations. 


As one example of the importance of this insight, recall from Table 5.7 that the 
matrix formula for the optimal solution obtained by the simplex method is 


$$\mathbf{x}_B = \mathbf{B}^{-1}\mathbf{b},$$


where $\mathbf{x}_B$ is the vector of basic variables, $\mathbf{B}^{-1}$ is the matrix of coefficients of slack
variables for rows 1 to $m$ of the final tableau, and $\mathbf{b}$ is the vector of original right-hand
sides (resource availabilities). (We soon will denote this particular $\mathbf{B}^{-1}$ by $\mathbf{S}^*$.)
Post-optimality analysis normally includes an investigation of possible changes in $\mathbf{b}$. By
using this formula, you can see exactly how the optimal basic feasible solution changes 
(or whether it becomes infeasible because of negative variables) as a function of b. 
You do not have to reapply the simplex method over and over again for each new b, 
because the coefficients of the slack variables tell all! In a similar fashion, this fun- 
damental insight provides a tremendous computational saving for the rest of sensitivity 
analysis as well. 
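
To make this concrete, here is a minimal sketch that evaluates $\mathbf{x}_B = \mathbf{B}^{-1}\mathbf{b}$ for different right-hand sides of the Wyndor Glass Co. problem and flags infeasibility. The matrix entries are the slack-variable coefficients of the final tableau in Table 5.8; the function name `basic_solution` and the trial value $b_2 = 24$ are assumptions chosen only for illustration.

```python
from fractions import Fraction

# Slack-variable coefficients in rows 1-3 of the final Wyndor tableau (B^{-1}).
S_STAR = [[Fraction(1), Fraction(1, 3), Fraction(-1, 3)],
          [Fraction(0), Fraction(1, 2), Fraction(0)],
          [Fraction(0), Fraction(-1, 3), Fraction(1, 3)]]

def basic_solution(b):
    """Return (x3, x2, x1) = B^{-1} b and whether every value is nonnegative."""
    x_B = [sum(row[k] * b[k] for k in range(len(b))) for row in S_STAR]
    return x_B, all(value >= 0 for value in x_B)

original = basic_solution([4, 12, 18])    # the original right-hand sides
larger_b2 = basic_solution([4, 24, 18])   # hypothetical change in b2 only
```

With the original $\mathbf{b}$ the values are (2, 6, 2); with the hypothetical $b_2 = 24$ the last basic variable becomes negative, so that basis would no longer be feasible.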

To spell out the how and the why of this insight, let us look again at the Wyndor 
Glass Co. example. (Your OR COURSEWARE also includes another demonstration 
example.) 


EXAMPLE: Table 5.8 shows the relevant portion of the simplex tableaux for dem-
onstrating this fundamental insight. Darker lines have been drawn around the coeffi-
cients of the slack variables in the second and third tableaux because these are the
crucial coefficients for applying the insight. To avoid clutter, we then identify the
pivot row and pivot column by a single box around the pivot number only.


Table 5.8 Simplex Tableaux without Leftmost
Columns for Wyndor Glass Co. Problem

                       Coefficient of
Iteration     x1     x2  |   x3     x4     x5  |  Right Side
    0         -3     -5  |    0      0      0  |       0
               1      0  |    1      0      0  |       4
               0     [2] |    0      1      0  |      12
               3      2  |    0      0      1  |      18

    1         -3      0  |    0     5/2     0  |      30
               1      0  |    1      0      0  |       4
               0      1  |    0     1/2     0  |       6
              [3]     0  |    0     -1      1  |       6

    2          0      0  |    0     3/2     1  |      36
               0      0  |    1     1/3   -1/3 |       2
               0      1  |    0     1/2     0  |       6
               1      0  |    0    -1/3    1/3 |       2


Iteration 1: To demonstrate the fundamental insight, our focus is on the algebraic
operations performed by the simplex method while using Gaussian elimination to
obtain the new basic feasible solution. If we divide the pivot row by the pivot number
last rather than first, then the algebraic operations spelled out in Chap. 4 for iteration
1 are


New row 0 = old row 0 + (5/2) old row 2,
New row 1 = old row 1 + (0) old row 2,
New row 2 = (1/2) old row 2,
New row 3 = old row 3 + (-1) old row 2.


Ignoring row 0 for the moment, these algebraic operations amount to premultiplying
rows 1-3 of the initial tableau by the first matrix shown below.


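$$\text{New rows 1-3} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -1 & 1 \end{bmatrix}\left[\begin{array}{cc|ccc|c} 1 & 0 & 1 & 0 & 0 & 4 \\ 0 & 2 & 0 & 1 & 0 & 12 \\ 3 & 2 & 0 & 0 & 1 & 18 \end{array}\right] = \left[\begin{array}{cc|ccc|c} 1 & 0 & 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & \tfrac{1}{2} & 0 & 6 \\ 3 & 0 & 0 & -1 & 1 & 6 \end{array}\right].$$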


Note how this first matrix is reproduced exactly as the coefficients of the slack
variables in rows 1-3 of the new tableau, because the coefficients of the slack variables
in rows 1-3 of the initial tableau form an identity matrix. Thus, just as stated in the 
verbal description of the fundamental insight, the coefficients of the slack variables 
in the new tableau do indeed provide a record of the algebraic operations performed. 

This insight is not much to get excited about after just one iteration, since you 
can readily see from the initial tableau what the algebraic operations had to be, but it 
becomes invaluable after all of the iterations are completed. 

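For row 0, the algebraic operation performed amounts to the following matrix
calculations, where now our focus is on the vector $[0, \tfrac{5}{2}, 0]$ that premultiplies rows
1-3 of the initial tableau.

$$\text{New row 0} = [-3,\ -5 \mid 0,\ 0,\ 0 \mid 0] + [0,\ \tfrac{5}{2},\ 0]\left[\begin{array}{cc|ccc|c} 1 & 0 & 1 & 0 & 0 & 4 \\ 0 & 2 & 0 & 1 & 0 & 12 \\ 3 & 2 & 0 & 0 & 1 & 18 \end{array}\right] = [-3,\ 0 \mid 0,\ \tfrac{5}{2},\ 0 \mid 30].$$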


Note how this vector is reproduced exactly as the coefficients of the slack variables 
in row 0 of the new tableau, just as was claimed in the statement of the fundamental 
insight. (Once again, the reason is the identity matrix for the coefficients of the slack 
variables in rows 1-3 of the initial tableau, along with the zeroes for these coefficients 
in row 0 of the initial tableau.) 


Iteration 2: The algebraic operations performed on the second tableau of Table 5.8
for iteration 2 are


New row 0 = old row 0 + (1) old row 3,
New row 1 = old row 1 + (-1/3) old row 3,
New row 2 = old row 2 + (0) old row 3,
New row 3 = (1/3) old row 3.


Ignoring row 0 for the moment, these operations amount to premultiplying rows 1-3
of this tableau by the matrix

$$\begin{bmatrix} 1 & 0 & -\tfrac{1}{3} \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac{1}{3} \end{bmatrix}.$$


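Writing this second tableau as the matrix product shown for iteration 1 (namely, the
corresponding matrix times rows 1-3 of the initial tableau) then yields

$$\begin{aligned}
\text{Final rows 1-3} &= \begin{bmatrix} 1 & 0 & -\tfrac{1}{3} \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac{1}{3} \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -1 & 1 \end{bmatrix}\left[\begin{array}{cc|ccc|c} 1 & 0 & 1 & 0 & 0 & 4 \\ 0 & 2 & 0 & 1 & 0 & 12 \\ 3 & 2 & 0 & 0 & 1 & 18 \end{array}\right] \\
&= \begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix}\left[\begin{array}{cc|ccc|c} 1 & 0 & 1 & 0 & 0 & 4 \\ 0 & 2 & 0 & 1 & 0 & 12 \\ 3 & 2 & 0 & 0 & 1 & 18 \end{array}\right] \\
&= \left[\begin{array}{cc|ccc|c} 0 & 0 & 1 & \tfrac{1}{3} & -\tfrac{1}{3} & 2 \\ 0 & 1 & 0 & \tfrac{1}{2} & 0 & 6 \\ 1 & 0 & 0 & -\tfrac{1}{3} & \tfrac{1}{3} & 2 \end{array}\right].
\end{aligned}$$
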
The first two matrices shown on the first line of these calculations summarize the 
algebraic operations of the second and first iterations, respectively. Their product, 
shown as the first matrix on the second line, then combines the algebraic operations 
of the two iterations. Note how this matrix is reproduced exactly as the coefficients 
of the slack variables in rows 1-3 of the new (final) tableau shown on the third line. 
What this portion of the tableau reveals is how the entire final tableau (except row 0) 
has been obtained from the initial tableau, namely, 


Final row 1 = (1) initial row 1 + (1/3) initial row 2 + (-1/3) initial row 3,
Final row 2 = (0) initial row 1 + (1/2) initial row 2 + (0) initial row 3,
Final row 3 = (0) initial row 1 + (-1/3) initial row 2 + (1/3) initial row 3.


To see why these multipliers of the initial rows are correct, you would have to
trace through all of the algebraic operations of both iterations. For example, why does
final row 1 include (1/3) initial row 2, even though a multiple of row 2 has never been
added directly to row 1? The reason is that initial row 2 was subtracted from initial
row 3 in iteration 1, and then (1/3) old row 3 was subtracted from old row 1 in
iteration 2.

However, there is no need for you to trace through. Even when the simplex 
method has gone through hundreds or thousands of iterations, the coefficients of the 
slack variables in the final tableau would reveal how this tableau has been obtained 
from the initial tableau. Furthermore, the same algebraic operations would give these 
same coefficients even if the values of some of the parameters in the original model 
(initial tableau) were changed, so these coefficients also reveal how the rest of the 
final tableau changes with changes in the initial tableau. 
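
The claim that the slack-variable columns record the combined row operations can be checked numerically. The sketch below (an illustration only; `matmul`, `M1`, and `M2` are names introduced here, not notation from the text) multiplies the two operation matrices from iterations 1 and 2 and compares the product with the slack columns of the resulting final rows 1-3.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

half, third = Fraction(1, 2), Fraction(1, 3)

# Row operations of iteration 1 (pivot in row 2) and iteration 2 (pivot in row 3).
M1 = [[1, 0, 0], [0, half, 0], [0, -1, 1]]
M2 = [[1, 0, -third], [0, 1, 0], [0, 0, third]]

# Rows 1-3 of the initial tableau: [x1, x2 | x3, x4, x5 | right side].
T = [[1, 0, 1, 0, 0, 4],
     [0, 2, 0, 1, 0, 12],
     [3, 2, 0, 0, 1, 18]]

M = matmul(M2, M1)        # combined operations of both iterations
T_final = matmul(M, T)    # rows 1-3 of the final tableau
slack_columns = [row[2:5] for row in T_final]
```

Because the slack columns of `T` form an identity matrix, `slack_columns` comes out equal to `M`, whatever the other entries of `T` happen to be.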

To complete this story for row 0, the fundamental insight reveals that the entire
final row 0 can be calculated from the initial tableau by using just the coefficients of
the slack variables in the final row 0, $[0, \tfrac{3}{2}, 1]$. This calculation is shown below,
where the first vector is row 0 of the initial tableau and the matrix is rows 1-3 of the
initial tableau.


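$$\text{Final row 0} = [-3,\ -5 \mid 0,\ 0,\ 0 \mid 0] + [0,\ \tfrac{3}{2},\ 1]\left[\begin{array}{cc|ccc|c} 1 & 0 & 1 & 0 & 0 & 4 \\ 0 & 2 & 0 & 1 & 0 & 12 \\ 3 & 2 & 0 & 0 & 1 & 18 \end{array}\right] = [0,\ 0 \mid 0,\ \tfrac{3}{2},\ 1 \mid 36].$$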


Note again how the vector premultiplying rows 1-3 of the initial tableau is reproduced 
exactly as the coefficients of the slack variables in the final row 0. These quantities 
must be identical because of the coefficients of the slack variables in the initial tableau
(an identity matrix below a null vector). This conclusion is the row 0 part of the
fundamental insight.


Mathematical Summary 


Because its primary applications involve the final tableau, we shall now give a general 
mathematical expression for the fundamental insight just in terms of this tableau, 
using matrix notation. If you haven't read Sec. 5.2, you now need to know that the
parameters of the model are given by the matrix $\mathbf{A} = \|a_{ij}\|$ and the vectors $\mathbf{b} = \|b_i\|$
and $\mathbf{c} = \|c_j\|$, as displayed at the beginning of that section.

The only other notation needed is summarized and illustrated in Table 5.9. 
Notice how the vector t (representing row 0) and the matrix T (representing the other 
rows) together correspond to the rows of the initial tableau in Table 5.8, whereas the 
vector t* and matrix T* together correspond to the rows of the final tableau in Table 
5.8. This table also shows these vectors and matrices partitioned into three parts: The 
coefficients of the original variables, the coefficients of the slack variables (our focus), 
and the right-hand side. Once again, the notation distinguishes between parts of the 
initial tableau and the final tableau by using an asterisk only in the latter case. 


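Table 5.9 General Notation for Initial and Final Simplex Tableaux in
Matrix Form, Illustrated by Wyndor Glass Co. Problem


Initial Tableau:

Row 0: $\mathbf{t} = [-3,\ -5 \mid 0,\ 0,\ 0 \mid 0] = [-\mathbf{c} \mid \mathbf{0} \mid 0]$.

Other rows: $\mathbf{T} = \left[\begin{array}{cc|ccc|c} 1 & 0 & 1 & 0 & 0 & 4 \\ 0 & 2 & 0 & 1 & 0 & 12 \\ 3 & 2 & 0 & 0 & 1 & 18 \end{array}\right] = [\mathbf{A} \mid \mathbf{I} \mid \mathbf{b}]$.

Combined: $\begin{bmatrix} \mathbf{t} \\ \mathbf{T} \end{bmatrix} = \left[\begin{array}{c|c|c} -\mathbf{c} & \mathbf{0} & 0 \\ \mathbf{A} & \mathbf{I} & \mathbf{b} \end{array}\right]$.


Final Tableau:

Row 0: $\mathbf{t}^* = [0,\ 0 \mid 0,\ \tfrac{3}{2},\ 1 \mid 36] = [\mathbf{z}^* - \mathbf{c} \mid \mathbf{y}^* \mid Z^*]$.

Other rows: $\mathbf{T}^* = \left[\begin{array}{cc|ccc|c} 0 & 0 & 1 & \tfrac{1}{3} & -\tfrac{1}{3} & 2 \\ 0 & 1 & 0 & \tfrac{1}{2} & 0 & 6 \\ 1 & 0 & 0 & -\tfrac{1}{3} & \tfrac{1}{3} & 2 \end{array}\right] = [\mathbf{A}^* \mid \mathbf{S}^* \mid \mathbf{b}^*]$.

Combined: $\begin{bmatrix} \mathbf{t}^* \\ \mathbf{T}^* \end{bmatrix} = \left[\begin{array}{c|c|c} \mathbf{z}^* - \mathbf{c} & \mathbf{y}^* & Z^* \\ \mathbf{A}^* & \mathbf{S}^* & \mathbf{b}^* \end{array}\right]$.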


For the coefficients of the slack variables (the middle part) in the initial tableau
of Table 5.9, notice the null vector $\mathbf{0}$ in row 0 and the identity matrix $\mathbf{I}$ below, which
provide the keys for the fundamental insight. The vector and matrix in the same
location of the final tableau, $\mathbf{y}^*$ and $\mathbf{S}^*$, then play a prominent role in the equations
for the fundamental insight. (This matrix was denoted by $\mathbf{B}^{-1}$ for any tableau in Sec.
5.2, but we now are letting $\mathbf{S}^*$ denote this particular matrix for just the final tableau,
where $\mathbf{S}$ stands for Slack variable coefficients.) $\mathbf{A}$ and $\mathbf{b}$ in the initial tableau turn into
$\mathbf{A}^*$ and $\mathbf{b}^*$ in the final tableau. For row 0 of the final tableau, the coefficients of the
original variables are $\mathbf{z}^* - \mathbf{c}$ (so the vector $\mathbf{z}^*$ is what has been added to the vector
of initial coefficients, $-\mathbf{c}$), and the right-hand side $Z^*$ denotes the optimal value
of $Z$.

Now suppose that you are given the initial tableau, $\mathbf{t}$ and $\mathbf{T}$, and just $\mathbf{y}^*$ and $\mathbf{S}^*$
from the final tableau. How can this information alone be used to calculate the rest
of the final tableau? The answer is provided by Table 5.7. This table includes some
information that is not directly relevant to our current discussion, namely, how $\mathbf{y}^*$
and $\mathbf{S}^*$ themselves can be calculated ($\mathbf{y}^* = \mathbf{c}_B\mathbf{B}^{-1}$ and $\mathbf{S}^* = \mathbf{B}^{-1}$) by knowing the
current set of basic variables and so the current basis matrix $\mathbf{B}$. However, the lower
part of this table (which can represent either an intermediate or final simplex tableau)
also shows how the rest of the tableau can be obtained from the coefficients of the
slack variables, which is summarized as follows.


Fundamental Insight 


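1. $\mathbf{t}^* = \mathbf{t} + \mathbf{y}^*\mathbf{T} = [\mathbf{y}^*\mathbf{A} - \mathbf{c} \mid \mathbf{y}^* \mid \mathbf{y}^*\mathbf{b}]$.
2. $\mathbf{T}^* = \mathbf{S}^*\mathbf{T} = [\mathbf{S}^*\mathbf{A} \mid \mathbf{S}^* \mid \mathbf{S}^*\mathbf{b}]$.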


We already used these two equations when dealing with iteration 2 for the
Wyndor Glass Co. problem in the preceding subsection. In particular, the right-hand
side of the expression for final row 0 for iteration 2 is just $\mathbf{t} + \mathbf{y}^*\mathbf{T}$, and the second
line of the expression for final rows 1-3 is just $\mathbf{S}^*\mathbf{T}$.
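
The two equations translate directly into a few lines of code. The following sketch (with the hypothetical function name `final_tableau`, and exact fractions used as an implementation choice) takes the initial Wyndor Glass Co. tableau together with only $\mathbf{y}^*$ and $\mathbf{S}^*$ and rebuilds the whole final tableau.

```python
from fractions import Fraction

def vecmat(v, X):
    """Row vector times matrix."""
    return [sum(v[k] * X[k][j] for k in range(len(v))) for j in range(len(X[0]))]

def matmul(X, Y):
    """Matrix product, computed one row at a time."""
    return [vecmat(row, Y) for row in X]

def final_tableau(t, T, y_star, S_star):
    """Apply t* = t + y*T and T* = S*T."""
    yT = vecmat(y_star, T)
    t_star = [t[j] + yT[j] for j in range(len(t))]
    return t_star, matmul(S_star, T)

# Initial Wyndor tableau: [x1, x2 | x3, x4, x5 | right side].
t = [-3, -5, 0, 0, 0, 0]
T = [[1, 0, 1, 0, 0, 4],
     [0, 2, 0, 1, 0, 12],
     [3, 2, 0, 0, 1, 18]]

# Slack-variable coefficients read off the final tableau.
y_star = [Fraction(0), Fraction(3, 2), Fraction(1)]
S_star = [[Fraction(1), Fraction(1, 3), Fraction(-1, 3)],
          [Fraction(0), Fraction(1, 2), Fraction(0)],
          [Fraction(0), Fraction(-1, 3), Fraction(1, 3)]]

t_star, T_star = final_tableau(t, T, y_star, S_star)
```

The same function can be called with a modified `t` or `T` to see what the same sequence of algebraic operations would produce for a revised initial tableau.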

Now let us summarize the mathematical logic behind the two equations for the
fundamental insight. To derive equation 2, recall that the entire sequence of algebraic 
operations performed by the simplex method (excluding those involving row 0) is 
equivalent to premultiplying T by some matrix, call it M. Therefore, 


T* = MT, 


but now we need to identify M. Writing out the component parts of T and T*, this 
T* = MT equation becomes 


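$$[\mathbf{A}^* \mid \mathbf{S}^* \mid \mathbf{b}^*] = \mathbf{M}[\mathbf{A} \mid \mathbf{I} \mid \mathbf{b}] = [\mathbf{M}\mathbf{A} \mid \mathbf{M} \mid \mathbf{M}\mathbf{b}].$$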


Because the middle (or any other) component of these equal matrices must be the 
same, it follows that M = S*, so equation 2 is a valid equation. 

Equation 1 is derived in a similar fashion by noting that the entire sequence of 
algebraic operations involving row 0 amounts to adding some linear combination of 
the rows in T to t, which is equivalent to adding to t some vector times T. Denoting 
this vector by v, we thereby have 


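$$\mathbf{t}^* = \mathbf{t} + \mathbf{v}\mathbf{T},$$

but $\mathbf{v}$ still needs to be identified. Writing out the component parts of $\mathbf{t}$ and $\mathbf{t}^*$ yields

$$[\mathbf{z}^* - \mathbf{c} \mid \mathbf{y}^* \mid Z^*] = [-\mathbf{c} \mid \mathbf{0} \mid 0] + \mathbf{v}[\mathbf{A} \mid \mathbf{I} \mid \mathbf{b}] = [-\mathbf{c} + \mathbf{v}\mathbf{A} \mid \mathbf{v} \mid \mathbf{v}\mathbf{b}].$$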


Equating the middle component of these equal vectors gives v = y*, which validates 
equation 1. 

Thus far, the fundamental insight has been described under the assumption that 
the original model is in our standard form described in Sec. 3.2. However, the above 
mathematical logic now reveals just what adjustments are needed for other forms of 
the original model. The key is the identity matrix I in the initial tableau, which turns 
into S* in the final tableau. If some artificial variables must be introduced into the 
initial tableau to serve as initial basic variables, then it is the set of columns
(appropriately ordered) for all of the initial basic variables (both slack and artificial) that
form I in this tableau. The same columns in the final tableau provide S* for the 
T* = S*T equation and y* for the t* = t + y*T equation. If Big M’s were introduced 
into the preliminary row 0 as coefficients for artificial variables, then the t for the 
t* = t + y*T equation is the row 0 for the initial tableau after algebraically elimi- 
nating these nonzero coefficients for basic variables. (Alternatively, the preliminary 
row 0 can be used for t, but then these M’s must be subtracted from the final row 0 
to give y*.) (See Prob. 35.) 


Applications 


The fundamental insight has a variety of important applications in linear programming.
One of these applications involves the revised simplex method. As described in the
preceding section (see Table 5.7), this method uses $\mathbf{S}^* = \mathbf{B}^{-1}$ and the initial tableau
to calculate all the relevant numbers in the current tableau for every iteration. It goes
even further than the fundamental insight by using $\mathbf{B}^{-1}$ to calculate $\mathbf{y}^*$ itself as
$\mathbf{y}^* = \mathbf{c}_B\mathbf{B}^{-1}$.

Another application involves the interpretation of the shadow prices
$(y_1^*, y_2^*, \ldots, y_m^*)$ described in Sec. 4.7. The fundamental insight reveals that $Z^*$ (the
value of $Z$ for the optimal solution) is

$$Z^* = \mathbf{y}^*\mathbf{b} = \sum_{i=1}^{m} y_i^* b_i,$$

so, for example,

$$Z^* = 0b_1 + \tfrac{3}{2}b_2 + b_3$$

for the Wyndor Glass Co. problem. This equation immediately yields the interpretation
for the $y_i^*$ given in Sec. 4.7.
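
A small numerical check of this formula, written as a sketch with an illustrative function name (`optimal_value`) and the Wyndor Glass Co. shadow prices taken from the final row 0:

```python
from fractions import Fraction

# Shadow prices y* (slack-variable coefficients in the final row 0).
y_star = [Fraction(0), Fraction(3, 2), Fraction(1)]

def optimal_value(b):
    """Z* = y* b, valid as long as the final basis stays feasible for b."""
    return sum(y_i * b_i for y_i, b_i in zip(y_star, b))

z_original = optimal_value([4, 12, 18])
z_more_b2 = optimal_value([4, 13, 18])   # one more unit of the second resource
```

The difference `z_more_b2 - z_original` equals the second shadow price, which is exactly the rate interpretation: each $y_i^*$ is the change in $Z^*$ per unit increase in $b_i$.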

Another group of extremely important applications involves various post-optimality
tasks (reoptimization technique, sensitivity analysis, parametric linear programming,
all described in Sec. 4.7) that involve investigating the effect of making one or
more changes in the original model. In particular, suppose that the simplex method
already has been applied to obtain an optimal solution (as well as $\mathbf{y}^*$ and $\mathbf{S}^*$) for the
original model, and then these changes are made. If exactly the same sequence of
algebraic operations were to be applied to the revised initial tableau, what would be
the resulting changes in the final tableau? Because $\mathbf{y}^*$ and $\mathbf{S}^*$ don't change, the
fundamental insight reveals the answer immediately.

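For example, consider the change from $b_2 = 12$ to $b_2 = 13$ as illustrated in
Fig. 4.3 for the Wyndor Glass Co. problem. It isn't necessary to solve for the new
optimal solution $(x_1, x_2) = (\tfrac{5}{3}, \tfrac{13}{2})$ because the values of the basic variables in the
final tableau ($\mathbf{b}^*$) are immediately revealed by the fundamental insight:

$$\begin{bmatrix} x_3 \\ x_2 \\ x_1 \end{bmatrix} = \mathbf{b}^* = \mathbf{S}^*\mathbf{b} = \begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix}\begin{bmatrix} 4 \\ 13 \\ 18 \end{bmatrix} = \begin{bmatrix} \tfrac{7}{3} \\ \tfrac{13}{2} \\ \tfrac{5}{3} \end{bmatrix}.$$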


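There is an even easier way to make this calculation. Since the only change is in the
second component of $\mathbf{b}$, which gets premultiplied by only the second column of $\mathbf{S}^*$,
the change in $\mathbf{b}^*$ can be calculated as simply

$$\Delta\mathbf{b}^* = \begin{bmatrix} \tfrac{1}{3} \\ \tfrac{1}{2} \\ -\tfrac{1}{3} \end{bmatrix}\Delta b_2 = \begin{bmatrix} \tfrac{1}{3} \\ \tfrac{1}{2} \\ -\tfrac{1}{3} \end{bmatrix},$$

so the original values of the basic variables in the final tableau ($x_3 = 2$, $x_2 = 6$,
$x_1 = 2$) now become

$$\begin{bmatrix} x_3 \\ x_2 \\ x_1 \end{bmatrix} = \begin{bmatrix} 2 \\ 6 \\ 2 \end{bmatrix} + \begin{bmatrix} \tfrac{1}{3} \\ \tfrac{1}{2} \\ -\tfrac{1}{3} \end{bmatrix} = \begin{bmatrix} \tfrac{7}{3} \\ \tfrac{13}{2} \\ \tfrac{5}{3} \end{bmatrix}.$$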


(If any of these new values were negative, and thus infeasible, then the reoptimization
technique described in Sec. 4.7 would be applied, starting from this revised final
tableau.) Applying incremental analysis to the preceding equation for $Z^*$ also
immediately yields

$$\Delta Z^* = \tfrac{3}{2}\,\Delta b_2 = \tfrac{3}{2}.$$
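
The column shortcut can be sketched as follows. This is only an illustration: the function name `update_basic_values` is hypothetical, columns are indexed from 0 (so index 1 is the second column of $\mathbf{S}^*$), and the data are the Wyndor Glass Co. values used above.

```python
from fractions import Fraction

S_star = [[Fraction(1), Fraction(1, 3), Fraction(-1, 3)],
          [Fraction(0), Fraction(1, 2), Fraction(0)],
          [Fraction(0), Fraction(-1, 3), Fraction(1, 3)]]
x_B = [Fraction(2), Fraction(6), Fraction(2)]   # current values of x3, x2, x1

def update_basic_values(x_B, S_star, i, delta):
    """Add delta times column i of S* to the basic-variable values."""
    return [x_B[r] + S_star[r][i] * delta for r in range(len(x_B))]

new_x_B = update_basic_values(x_B, S_star, 1, 1)   # b2 rises from 12 to 13
still_feasible = all(value >= 0 for value in new_x_B)
```

Only one column of $\mathbf{S}^*$ is touched, so the work grows with the number of constraints rather than with the size of the whole tableau.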


The fundamental insight can be applied to investigating other kinds of changes 
in the original model in a very similar fashion; it is the crux of the sensitivity analysis 
procedure described in the latter part of Chap. 6. 

You also will see in the next chapter that the fundamental insight plays a key 
role in the very useful duality theory for linear programming. 


5.4 Conclusions 


Although the simplex method is an algebraic procedure, it is based on some fairly 
simple geometric concepts. These concepts enable the algorithm to examine only a 
relatively small number of basic feasible solutions before reaching and identifying an 
optimal solution. 

The revised simplex method provides an effective way of adapting the simplex 
method for computer implementation. 

The final simplex tableau includes complete information on how it can be al- 
gebraically reconstructed directly from the initial simplex tableau. This fundamental 
insight has some very important applications, especially for post-optimality analysis. 
PROBLEMS


1.* Consider the following problem.

    Maximize      $Z = 3x_1 + 2x_2$,

    subject to    $2x_1 + x_2 \le 6$
                  $x_1 + 2x_2 \le 6$

    and           $x_1 \ge 0,\ x_2 \ge 0$.

(a) Solve this problem graphically. Identify the corner-point feasible solutions by
circling them on the graph.

(b) Identify all the sets of two defining equations for this problem. For each one, solve
(if a solution exists) for the corresponding corner-point solution, and classify it as
a corner-point feasible solution or corner-point infeasible solution.

(c) Introduce slack variables in order to write the functional constraints in augmented
form.

(d) For each set of defining equations from part (b), identify the indicating variable for
each equation, display the equations from part (c) after deleting these indicating
(nonbasic) variables, and give the resulting basic solution.

(e) Without executing the simplex method, use its geometric interpretation (and the
objective function) to identify the path (sequence of corner-point feasible solutions)
it would follow to reach the optimal solution. For each of these corner-point feasible
solutions in turn, rewrite the corresponding information from part (d) and then
identify the following decisions being made for the next iteration: (i) which defining
equation is being deleted and which is being added; (ii) which indicating variable
is being deleted (the entering basic variable) and which is being added (the leaving
basic variable).


2. Follow the instructions of Prob. 1 for the linear programming model in Prob. 4 of
Chap. 3.


3. Consider the following problem.

    Maximize      $Z = 2x_1 + 3x_2$,

    subject to    $-3x_1 + x_2 \le 1$
                  $4x_1 + 2x_2 \le 20$
                  $4x_1 - x_2 \le 10$
                  $-x_1 + 2x_2 \le 5$

    and           $x_1 \ge 0,\ x_2 \ge 0$.

(a) Solve this problem graphically. Identify the corner-point feasible solutions by
circling them on the graph.

(b) Develop a table giving each of the corner-point feasible solutions and the corre-
sponding defining equations, basic feasible solution, and nonbasic variables. Cal-
culate Z for each of these solutions and use just this information to identify the
optimal solution.

(c) Develop the corresponding table for the corner-point infeasible solutions, and so on.
Also identify the sets of defining equations and nonbasic variables that do not yield
a solution.


4. Consider the following problem.

    Maximize      $Z = 2x_1 - x_2 + x_3$,

    subject to    $3x_1 + x_2 + x_3 \le 60$
                  $x_1 - x_2 + 2x_3 \le 10$
                  $x_1 + x_2 - x_3 \le 20$

    and           $x_1 \ge 0,\ x_2 \ge 0,\ x_3 \ge 0$.

After introducing slack variables and then performing one complete iteration of the simplex
method, the following simplex tableau is obtained.

              Basic               Coefficient of                   Right
Iteration    Variable    Z    x1    x2    x3    x4    x5    x6     Side
                Z        1     0    -1     3     0     2     0      20
    1           x4       0     0     4    -5     1    -3     0      30
                x1       0     1    -1     2     0     1     0      10
                x6       0     0     2    -3     0    -1     1      10

(a) Identify the corner-point feasible solution obtained at iteration 1.
(b) Identify the constraint boundary equations that define this corner-point feasible
solution.


5. Consider the three-variable linear programming problem shown in Fig. 5.2.

(a) Construct a table like Table 5.1 giving the set of defining equations for each corner-
point feasible solution.

(b) What are the defining equations for the corner-point infeasible solution (6, 0, 5)?

(c) Identify one of the systems of three constraint boundary equations that yield neither
a corner-point feasible solution nor a corner-point infeasible solution. Explain why
this occurs for this system.


6. Consider the linear programming problem given in Table 6.1 as the dual problem
for the Wyndor Glass Co. example.

(a) Identify the 10 sets of defining equations for this problem. For each one, solve (if
a solution exists) for the corresponding corner-point solution, and classify it as a
corner-point feasible solution or corner-point infeasible solution.

(b) For each corner-point solution, give the corresponding basic solution and its set of
nonbasic variables. (Compare with Table 6.9.)
7. Consider the following problem.

    Minimize      $Z = x_1 + 2x_2$,

    subject to    $-x_1 + x_2 \le 15$
                  $2x_1 + x_2 \le 90$
                  $x_2 \ge 30$

    and           $x_1 \ge 0,\ x_2 \ge 0$.

(a) Solve this problem graphically.
(b) Develop a table giving each of the corner-point feasible solutions and the corre-
sponding defining equations, basic feasible solution, and nonbasic variables.


8. Reconsider Prob. 17 in Chap. 4. 

(a) Identify the 10 sets of defining equations for this problem. For each one, solve (if 
a solution exists) for the corresponding corner-point solution, and classify it as a 
corner-point feasible solution or a corner-point infeasible solution. 

(b) For each corner-point solution, give the corresponding basic solution and its set of
nonbasic variables. 


9. Reconsider Prob. 3 in Chap. 3. 

(a) Identify the 15 sets of defining equations for this problem. For each one, solve (if 
a solution exists) for the corresponding corner-point solution, and classify it as a
corner-point feasible solution or a corner-point infeasible solution. 

(b) For each corner-point solution, give the corresponding basic solution and its set of 
nonbasic variables. 


10.* Reconsider Prob. 23 in Chap. 4. Now you are given the information that the basic 
variables in the optimal solution are x, and x;. Use this information to identify a system of 
three constraint boundary equations whose simultaneous solution must be this optimal solution. 
Then solve this system of equations to obtain this solution. 


11. Reconsider Prob. 11 in Chap. 4. Using the given information and the theory of the
simplex method, analyze the constraints of the problem in order to identify a system of three 
constraint boundary equations whose simultaneous solution must be the optimal solution (not 
augmented). Then solve this system of equations to obtain this solution. 


12. Consider the following problem.

    Maximize      $Z = 2x_1 + 2x_2 + 3x_3$,

    subject to    $2x_1 + x_2 + 2x_3 \le 4$
                  $x_1 + x_2 + x_3 \le 3$

    and           $x_1 \ge 0,\ x_2 \ge 0,\ x_3 \ge 0$.

Let $x_4$ and $x_5$ be the slack variables for the respective functional constraints. Starting with these
two variables as the basic variables for the initial basic feasible solution, you now are given
the information that the simplex method proceeds as follows to obtain the optimal solution in
two iterations: (1) in iteration 1, the entering basic variable is $x_3$ and the leaving basic variable
is $x_4$; (2) in iteration 2, the entering basic variable is $x_2$ and the leaving basic variable is $x_5$.

(a) Develop a three-dimensional drawing of the feasible region for this problem, and
show the path followed by the simplex method.

(b) Give a geometric interpretation of why the simplex method followed this path.

(c) For each of the two edges of the feasible region traversed by the simplex method,
give the equation of each of the two constraint boundaries on which it lies, and then
give the equation of the additional constraint boundary at each endpoint.

(d) Identify the set of defining equations for each of the three corner-point feasible
solutions (including the initial one) obtained by the simplex method. Use the defining
equations to solve for these solutions.

(e) For each corner-point feasible solution obtained in part (d), give the corresponding
basic feasible solution and its set of nonbasic variables. Explain how these nonbasic
variables identify the defining equations obtained in part (d).


13. Consider the following problem.

    Maximize      $Z = 3x_1 + 4x_2 + 2x_3$,

    subject to    $x_1 + x_2 + x_3 \le 20$
                  $x_1 + 2x_2 + x_3 \le 30$

    and           $x_1 \ge 0,\ x_2 \ge 0,\ x_3 \ge 0$.

Let $x_4$ and $x_5$ be the slack variables for the respective functional constraints. Starting with these
two variables as the basic variables for the initial basic feasible solution, you now are given
the information that the simplex method proceeds as follows to obtain the optimal solution in
two iterations: (1) in iteration 1, the entering basic variable is $x_2$ and the leaving basic variable
is $x_5$; (2) in iteration 2, the entering basic variable is $x_1$ and the leaving basic variable is $x_4$.
Follow the instructions of Prob. 12 for this situation.


14. By inspecting Fig. 5.2, explain why Property 1(b) for corner-point feasible solutions 
holds for this problem if it has the following objective function. 

(a) Maximize $Z = x_3$.

(b) Maximize $Z = -x_1 + 2x_3$.


15. Consider the three-variable linear programming problem shown in Fig. 5.2. 

(a) Explain in geometric terms why the set of solutions satisfying any individual con- 
straint is a convex set as defined in Appendix 1. 

(b) Use the conclusion in part (a) to explain why the entire feasible region (the set of 
solutions that simultaneously satisfies every constraint) is a convex set. 


16. Suppose that the three-variable linear programming problem given in Fig. 5.2 has 
the objective function, 


Maximize $Z = 3x_1 + 4x_2 + 3x_3$.


Without using the algebra of the simplex method, apply just its geometric reasoning (including 
choosing the edge giving the maximum rate of increase of Z) to determine and explain the path 
it would follow in Fig. 5.2 from the origin to the optimal solution.


17. Consider the three-variable linear programming problem shown in Fig. 5.2. 

(a) Construct a table like Table 5.4 giving the indicating variable for each constraint 
boundary equation and original constraint. 

(b) For the corner-point feasible solution (2, 4, 3) and its three adjacent corner-point 
feasible solutions (4, 2, 4), (0, 4, 2), and (2, 4, 0), construct a table like Table 5.5 
giving the corresponding defining equations, basic feasible solution, and nonbasic 
variables. 

(c) Use the sets of defining equations from part (b) to demonstrate that (4, 2, 4), (0, 4, 
2), and (2, 4, 0) indeed are adjacent to (2, 4, 3), but that none of these three corner- 
point feasible solutions are adjacent to each other. Then use the sets of nonbasic 
variables from part (b) to demonstrate the same thing. 
18. The formula for the line passing through (2, 4, 3) and (4, 2, 4) in Fig. 5.2 can be
written as

$$(2, 4, 3) + \alpha[(4, 2, 4) - (2, 4, 3)] = (2, 4, 3) + \alpha(2, -2, 1),$$

where $0 \le \alpha \le 1$ for just the line segment between these points. After augmenting with the
slack variables $x_4$, $x_5$, $x_6$, $x_7$ for the respective functional constraints, this formula becomes

$$(2, 4, 3, 2, 0, 0, 0) + \alpha(2, -2, 1, -2, 2, 0, 0).$$


Use this formula directly to answer each of the following questions, and thereby relate the 
algebra and geometry of the simplex method as it goes through one iteration in moving from 
(2, 4, 3) to (4, 2, 4). (You are given the information that it is moving along this line segment.) 

(a) What is the entering basic variable? 

(b) What is the leaving basic variable? 

(c) What is the new basic feasible solution? 


19. Consider a two-variable mathematical programming problem that has the feasible 
region shown on the graph, where the six dots correspond to corner-point feasible solutions. 
The problem has a linear objective function, and the two dashed lines are objective function 
lines passing through the optimal solution (4, 5) and the second-best corner-point feasible 
solution (2, 5). Note that the nonoptimal solution (2, 5) is better than both of its adjacent 
comer-point feasible solutions, which violates Property 3 in Sec. 5.1 for corner-point feasible 
solutions in linear programming. 





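[Figure: graph of the feasible region, showing the six corner-point feasible solutions as dots and the two dashed objective function lines.]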


Demonstrate that this problem cannot be a linear programming problem by constructing 
the feasible region that would result if the six line segments on the boundary were constraint 
boundaries for linear programming constraints. 


20. Consider the following problem. 
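Maximize $Z = 8x_1 + 4x_2 + 6x_3 + 3x_4 + 9x_5$, 

subject to 
$$\begin{aligned}
x_1 + 2x_2 + 3x_3 + 3x_4 &\le 180 \quad \text{(resource 1)} \\
4x_1 + 3x_2 + 2x_3 + x_4 + x_5 &\le 270 \quad \text{(resource 2)} \\
x_1 + 3x_2 + x_4 + 3x_5 &\le 180 \quad \text{(resource 3)}
\end{aligned}$$
and $x_j \ge 0$ $(j = 1, \ldots, 5)$. 


You are given the facts that the basic variables in the optimal solution are $x_3$, $x_1$, and $x_5$, and 
that 

$$\begin{bmatrix} 3 & 1 & 0 \\ 2 & 4 & 1 \\ 0 & 1 & 3 \end{bmatrix}^{-1} = \frac{1}{27} \begin{bmatrix} 11 & -3 & 1 \\ -6 & 9 & -3 \\ 2 & -3 & 10 \end{bmatrix}.$$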


(a) Use the given information to identify the optimal solution. 
(b) Use the given information to identify the shadow prices for the three resources. 


21.* Use the revised simplex method to solve the following problem. 
Maximize $Z = 5x_1 + 8x_2 + 7x_3 + 4x_4 + 6x_5$, 

subject to 
$$\begin{aligned}
2x_1 + 3x_2 + 3x_3 + 2x_4 + 2x_5 &\le 20 \\
3x_1 + 5x_2 + 4x_3 + 2x_4 + 4x_5 &\le 30
\end{aligned}$$
and $x_j \ge 0$, for $j = 1, 2, 3, 4, 5$. 


22. Use the revised simplex method to solve the linear programming model given in 
Prob. 4, Chap. 4. 


23. Reconsider Prob. 1. For the sequence of corner-point solutions identified in part 
(e), construct the basis matrix B for each of the corresponding basic feasible solutions. For 
each one, invert B manually and use this $B^{-1}$ to calculate the current solution and then perform 
the next iteration (or demonstrate that the current solution is optimal). 


24. Use the revised simplex method to solve the linear programming model given in 
Prob. 2, Chap. 4. 


25. Use the revised simplex method to solve the linear programming model given in 
Prob. 7, Chap. 4. 


26. Use the revised simplex method to solve each of the following linear programming 
models: 

(a) Model given in Prob. 4, Chap. 3. 

(b) Model given in Prob. 9, Chap. 4. 


27.* Consider the following problem. 


Maximize $Z = x_1 - x_2 + 2x_3$, 

subject to 
$$\begin{aligned}
2x_1 - 2x_2 + 3x_3 &\le 5 \\
x_1 + x_2 - x_3 &\le 3 \\
x_1 - x_2 + x_3 &\le 2
\end{aligned}$$
and $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 

Let $x_4$, $x_5$, and $x_6$ denote the slack variables for the respective constraints. After applying the 
simplex method, a portion of the final simplex tableau is as follows: 

[Table: portion of the final simplex tableau]

(a) Use the fundamental insight presented in Sec. 5.3 to identify the missing numbers 
in the final simplex tableau. Show your calculations. 

(b) Identify the defining equations of the corner-point feasible solution corresponding 
to the optimal basic feasible solution in the final simplex tableau. 


28. Consider the following problem. 
Maximize $Z = 4x_1 + 3x_2 + x_3 + 2x_4$, 

subject to 
$$\begin{aligned}
4x_1 + 2x_2 + x_3 + x_4 &\le 5 \\
3x_1 + x_2 + 2x_3 + x_4 &\le 4
\end{aligned}$$
and $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$, $x_4 \ge 0$. 


Let $x_5$ and $x_6$ denote the slack variables for the respective constraints. After applying the simplex 
method, a portion of the final simplex tableau is as follows: 

[Table: portion of the final simplex tableau]

1 1 -1 
2 -1 2 


(a) Use the fundamental insight presented in Sec. 5.3 to identify the missing numbers 
in the final simplex tableau. Show your calculations. 

(b) Identify the defining equations of the corner-point feasible solution corresponding 
to the optimal basic feasible solution in the final simplex tableau. 




29. Consider the following problem. 
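Maximize $Z = 6x_1 + x_2 + 2x_3$, 

subject to 
$$\begin{aligned}
2x_1 + 2x_2 + \tfrac{1}{2}x_3 &\le 2 \\
-4x_1 - 2x_2 - \tfrac{3}{2}x_3 &\le 3 \\
x_1 + 2x_2 + \tfrac{1}{2}x_3 &\le 1
\end{aligned}$$
and $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 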


Let $x_4$, $x_5$, and $x_6$ denote the slack variables for the respective constraints. After applying the 
simplex method, a portion of the final simplex tableau is as follows: 

[Table: portion of the final simplex tableau]

Use the fundamental insight presented in Sec. 5.3 to identify the missing numbers in the final 
simplex tableau. Show your calculations. 


30. For iteration 2 of the example in Sec. 5.3, the following expression was shown. 


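$$\text{Final row 0} = [-3, -5 \mid 0, 0, 0 \mid 0] + \left[0, \tfrac{3}{2}, 1\right] \left[\begin{array}{cc|ccc|c} 1 & 0 & 1 & 0 & 0 & 4 \\ 0 & 2 & 0 & 1 & 0 & 12 \\ 3 & 2 & 0 & 0 & 1 & 18 \end{array}\right].$$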


Derive this expression by combining the algebraic operations (in matrix form) for iterations 1 
and 2 that affect row 0. 


31. Consider the following problem. 
Maximize $Z = x_1 - x_2 + 2x_3$, 

subject to 
$$\begin{aligned}
x_1 + x_2 + 3x_3 &\le 15 \\
2x_1 - x_2 + x_3 &\le 2 \\
-x_1 + x_2 + x_3 &\le 4
\end{aligned}$$
and $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 


Let $x_4$, $x_5$, and $x_6$ denote the slack variables for the respective constraints. After applying the 
simplex method, a portion of the final simplex tableau is as follows: 

[Table: portion of the final simplex tableau]


(a) Use the fundamental insight presented in Sec. 5.3 to identify the missing numbers 
in the final simplex tableau. Show your calculations. 

(b) Identify the defining equations of the corner-point feasible solution corresponding 
to the optimal basic feasible solution in the final simplex tableau. 


32. Consider the following problem. 
Maximize $Z = 20x_1 + 6x_2 + 8x_3$, 

subject to 
$$\begin{aligned}
8x_1 + 2x_2 + 3x_3 &\le 200 \\
4x_1 + 3x_2 &\le 100 \\
2x_1 + x_3 &\le 50 \\
x_3 &\le 20
\end{aligned}$$
and $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 


Let $x_4$, $x_5$, $x_6$, and $x_7$ denote the slack variables for the first through fourth constraints, 
respectively. Suppose that after some number of iterations of the simplex method a portion of 
the current simplex tableau is as follows: 

[Table: portion of the current simplex tableau]

(a) Use the fundamental insight presented in Sec. 5.3 to identify the missing numbers 
in the current simplex tableau. Show your calculations. 

(b) Indicate which of these missing numbers would be generated by the revised simplex 
method in order to perform the next iteration. 

(c) Identify the defining equations of the corner-point feasible solution corresponding 
to the basic feasible solution in the current simplex tableau. 


33. Most of the description of the fundamental insight presented in Sec. 5.3 assumes 
that the problem is in our standard form. Now consider each of the following other forms, 
where the additional adjustments in the initialization step are those presented in Sec. 4.6, 
including using artificial variables and the Big M method where appropriate. Describe the 
resulting adjustments in the fundamental insight. 

(a) Equality constraints. 

(b) Negative right-hand sides. 

(c) Variables allowed to be negative (with no lower bound). 


34. Reconsider Prob. 18 in Chap. 4. For this model that is not in our standard form, 
construct the complete first simplex tableau for the simplex method, and then identify the 
columns that will contain S* for applying the fundamental insight in the final tableau. Explain 
why these are the appropriate columns. 


35. Consider the following problem. 
Minimize $Z = 2x_1 + 3x_2 + 2x_3$, 

subject to 
$$\begin{aligned}
x_1 + 4x_2 + 2x_3 &\ge 8 \\
3x_1 + 2x_2 &\ge 6
\end{aligned}$$
and $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 


Let $x_4$ and $x_6$ be the surplus variables for the first and second constraints, respectively. Let $\bar{x}_5$ 
and $\bar{x}_7$ be the corresponding artificial variables. After making the adjustments described in Sec. 
4.6 for this model form when using the Big M method, the initial simplex tableau ready to 
apply the simplex method is as follows: 






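Basic Variable | Eq. No. | Z | $x_1$ | $x_2$ | $x_3$ | $x_4$ | $\bar{x}_5$ | $x_6$ | $\bar{x}_7$ | Right Side 
Z | 0 | -1 | (-4M + 2) | (-6M + 3) | (-2M + 2) | M | 0 | M | 0 | -14M 
$\bar{x}_5$ | 1 | 0 | 1 | 4 | 2 | -1 | 1 | 0 | 0 | 8 
$\bar{x}_7$ | 2 | 0 | 3 | 2 | 0 | 0 | 0 | -1 | 1 | 6 

After applying the simplex method, a portion of the final simplex tableau is as follows: 

Eq. No. | $\bar{x}_7$ 
0 | (M - 0.5) 
1 | -0.1 
2 | 0.4 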







(a) Based on the above tableaux, use the fundamental insight presented in Sec. 5.3 to 
identify the missing numbers in the final simplex tableau. Show your calculations. 

(b) Examine the mathematical logic presented in Sec. 5.3 to validate the fundamental 
insight (see the T* = MT and t* = t + vT equations and the subsequent derivations 
of M and v). This logic assumes that the original model fits our standard form, 
whereas the current problem does not fit this form. Show how, with minor adjust- 
ments, this same logic applies to the current problem when t is row 0 and T is rows 
1-2 in the initial simplex tableau given above. Derive M and v for this problem. 

(c) When applying the t* = t + vT equation, another option is to use t = [2, 3, 2, 0, 
M, 0, M, 0], which is the preliminary row 0 before algebraically eliminating the 
nonzero coefficients of the initial basic variables, $\bar{x}_5$ and $\bar{x}_7$. Repeat part (b) for this 
equation with this new t. After deriving the new v, show that this equation yields 
the same final row 0 for this problem as the equation derived in part (b). 

(d) Identify the defining equations of the corner-point feasible solution corresponding 
to the optimal basic feasible solution in the final simplex tableau. 


36. Consider the following problem. 


Maximize $Z = 2x_1 + 4x_2 + 3x_3$, 

subject to 
$$\begin{aligned}
x_1 + 3x_2 + 2x_3 &= 20 \\
x_1 + 5x_2 &\ge 10
\end{aligned}$$
and $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 


Let $\bar{x}_4$ be the artificial variable for the first constraint. Let $x_5$ and $\bar{x}_6$ be the surplus variable and 
artificial variable, respectively, for the second constraint. 

You are now given the information that a portion of the final simplex tableau is as 
follows: 

[Table: portion of the final simplex tableau]











(a) Extend the fundamental insight presented in Sec. 5.3 to identify the missing numbers 
in the final simplex tableau. Show your calculations. 

(b) Identify the defining equations of the corner-point feasible solution corresponding 
to the optimal solution in the final simplex tableau. 


37. Consider the following problem. 
Maximize $Z = 3x_1 + 7x_2 + 2x_3$, 

subject to 
$$\begin{aligned}
-2x_1 + 2x_2 + x_3 &\le 10 \\
3x_1 + x_2 - x_3 &\le 20
\end{aligned}$$
and $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 


You are given the fact that the basic variables in the optimal solution are $x_1$ and $x_3$. 

(a) Introduce slack variables, and then use the given information to find the optimal 
solution directly by Gaussian elimination (see Appendix 4). 

(b) Extend the work in part (a) to find the shadow prices. 

(c) Use the given information to identify the defining equations of the optimal corner- 
point feasible solution, and then solve these equations to obtain the optimal solution. 

(d) Construct the basis matrix B for the optimal basic feasible solution, invert B man- 
ually, and then use this $B^{-1}$ to solve for the optimal solution and the shadow prices 
(y*). Then apply the optimality test for the revised simplex method to verify that 
this solution is optimal. 

(e) Given $B^{-1}$ and y* from part (d), use the fundamental insight presented in Sec. 5.3 
to construct the complete final simplex tableau. 


6 


Duality Theory and 
Sensitivity Analysis 


One of the most important discoveries in the early development of linear programming 
was the concept of duality and its many important ramifications. This discovery re- 
vealed that every linear programming problem has associated with it another linear 
programming problem called the dual. The relationships between the dual problem 
and the original problem (called the primal) prove to be extremely useful in a variety 
of ways. For example, you soon will see that the shadow prices described in Sec. 4.7 
actually are provided by the optimal solution for the dual problem. We shall describe 
many other valuable applications of duality theory in this chapter as well. 

One of the key roles of duality theory is that of the interpretation and imple- 
mentation of sensitivity analysis. As we already mentioned in Secs. 2.3, 3.3, and 4.7, 
sensitivity analysis is a very important part of almost every linear programming study. 
Because some or all of the parameter values used in the original model are just 
estimates of future conditions, the effect on the optimal solution if other conditions 
prevail instead needs to be investigated. Furthermore, certain parameter values (such 
as resource amounts) may represent managerial decisions, in which case the choice 
of the parameter values may be the main issue to be studied, which would be done 
through sensitivity analysis. 

For greater clarity, the first three sections discuss duality theory under the 
assumption that the primal linear programming problem is in our standard form (but 
with no restriction that the $b_i$ need to be positive). Other forms are then discussed in 
Sec. 6.4. We begin the chapter by introducing the essence of duality theory and its 
applications. We then describe the economic interpretation of the dual problem (Sec. 
6.2) and delve deeper into the relationships between the primal and dual problems 
(Sec. 6.3). Section 6.5 focuses on the role of duality theory in sensitivity analysis. 
The basic procedure for sensitivity analysis (which is based on the fundamental insight 
of Sec. 5.3) is summarized in Sec. 6.6 and illustrated in Sec. 6.7. 


6.1 The Essence of Duality Theory 


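Using our standard form for the primal problem at the left (perhaps after conversion 
from another form), its dual problem has the form shown to the right. 

$$\begin{array}{ll}
\textbf{Primal problem} & \textbf{Dual problem} \\
\text{Maximize } Z = \sum_{j=1}^{n} c_j x_j, & \text{Minimize } y_0 = \sum_{i=1}^{m} b_i y_i, \\
\text{subject to} & \text{subject to} \\
\sum_{j=1}^{n} a_{ij} x_j \le b_i, \ \text{for } i = 1, 2, \ldots, m & \sum_{i=1}^{m} a_{ij} y_i \ge c_j, \ \text{for } j = 1, 2, \ldots, n \\
\text{and} & \text{and} \\
x_j \ge 0, \ \text{for } j = 1, 2, \ldots, n. & y_i \ge 0, \ \text{for } i = 1, 2, \ldots, m.
\end{array}$$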


Thus the dual problem uses exactly the same parameters as the primal problem, but 
in different locations. To highlight the comparison, now see these same two problems 
in matrix notation (as introduced at the beginning of Sec. 5.2), where $c$ and 
$y = [y_1, y_2, \ldots, y_m]$ are row vectors but $b$ and $x$ are column vectors. 

$$\begin{array}{ll}
\textbf{Primal problem} & \textbf{Dual problem} \\
\text{Maximize } Z = cx, & \text{Minimize } y_0 = yb, \\
\text{subject to} & \text{subject to} \\
Ax \le b & yA \ge c \\
\text{and} & \text{and} \\
x \ge 0. & y \ge 0.
\end{array}$$








To illustrate, the primal and dual problems for the Wyndor Glass Co. example of 
Sec. 3.1 are shown in Table 6.1 in matrix form. 
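
In computational terms, forming the dual of a problem in standard form amounts to transposing the constraint matrix and exchanging the roles of the right-hand sides and the objective coefficients. The following is a minimal sketch of that bookkeeping for the Wyndor Glass Co. data; the helper name `dual_of_standard_form` and the use of plain Python lists are assumptions made for illustration, not notation from the text.

```python
def dual_of_standard_form(A, b, c):
    """Dual data for: maximize cx subject to Ax <= b, x >= 0.

    Hypothetical helper for illustration. The dual is
    minimize yb subject to yA >= c, y >= 0, returned here as
    (one row of coefficients per dual constraint, dual right-hand sides,
    dual objective coefficients).
    """
    # Column j of A holds the coefficients of dual constraint j.
    A_transposed = [list(column) for column in zip(*A)]
    return A_transposed, list(c), list(b)


# Wyndor Glass Co. example (Sec. 3.1)
A = [[1, 0], [0, 2], [3, 2]]
b = [4, 12, 18]
c = [3, 5]

dual_rows, dual_rhs, dual_objective = dual_of_standard_form(A, b, c)

# Print the dual constraints and objective in readable form.
for row, rhs in zip(dual_rows, dual_rhs):
    terms = " + ".join(f"{a}*y{i + 1}" for i, a in enumerate(row))
    print(f"{terms} >= {rhs}")
print("minimize " + " + ".join(f"{bi}*y{i + 1}" for i, bi in enumerate(dual_objective)))
```

The two printed constraints correspond to $y_1 + 3y_3 \ge 3$ and $2y_2 + 2y_3 \ge 5$, and the printed objective to $y_0 = 4y_1 + 12y_2 + 18y_3$.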

The primal-dual table for linear programming (Table 6.2) also helps to 
highlight the correspondence between the two problems. It shows all the linear 
programming parameters (the $a_{ij}$, $b_i$, and $c_j$) and how they are used to construct the two 
problems. All the headings for the primal problem are horizontal, whereas the headings 
for the dual problem are read by turning the book sideways. We suggest that you 
begin by looking at each problem individually by covering up the headings for the 

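Table 6.1 Primal and Dual Problems for Wyndor Glass Co. Example 

$$\begin{array}{ll}
\textbf{Primal problem} & \textbf{Dual problem} \\
\text{Maximize } Z = [3, 5] \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, & \text{Minimize } y_0 = [y_1, y_2, y_3] \begin{bmatrix} 4 \\ 12 \\ 18 \end{bmatrix}, \\
\text{subject to} & \text{subject to} \\
\begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \le \begin{bmatrix} 4 \\ 12 \\ 18 \end{bmatrix} & [y_1, y_2, y_3] \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 3 & 2 \end{bmatrix} \ge [3, 5] \\
\text{and} & \text{and} \\
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \ge \begin{bmatrix} 0 \\ 0 \end{bmatrix}. & [y_1, y_2, y_3] \ge [0, 0, 0].
\end{array}$$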








other problem with your hands. Then, after you see what the table is saying for the 
individual problems, compare them. 

Particularly notice in Table 6.2 how (1) the parameters for a constraint in either 
problem are the coefficients of a variable in the other problem, and (2) the coefficients 
for the objective function of either problem are the right sides for the other problem. 
Thus there is a direct correspondence between these entities in the two problems, as 
summarized in Table 6.3. These correspondences are a key to some of the applications 
of duality theory, including sensitivity analysis. 


Table 6.2 Primal-Dual Table for Linear Programming, 
Illustrated by Wyndor Glass Co. Example 

[Table with two panels, (a) General case and (b) Wyndor Glass Co. example; surviving row label: Coefficients for Objective Function (Maximize)] 


Table 6.3 Correspondence between Entities in Primal and Dual Problems 

One Problem | Other Problem 
Constraint i | Variable i 
Objective function | Right sides 

Origin of the Dual Problem 


Duality theory is based directly on the fundamental insight (particularly with regard 
to row 0) presented in Sec. 5.3. To see why, we continue to use the notation introduced 
in Table 5.9 for row 0 of the final tableau, except for replacing $Z^*$ by $y_0$ and dropping 
the asterisks from $z^*$ and $y^*$ when referring to any tableau. Thus, at any given iteration 
of the simplex method for the primal problem, the current numbers in row 0 are 
denoted as shown in the (partial) tableau given in Table 6.4. Also recall [see Eq. (1) 
in the Mathematical Summary subsection of Sec. 5.3] that the fundamental insight 
led to the following relationships between these quantities and the parameters of the 
original model: 

$$y_0 = yb = \sum_{i=1}^{m} b_i y_i,$$

$$z = yA, \quad \text{so} \quad z_j = \sum_{i=1}^{m} a_{ij} y_i, \quad \text{for } j = 1, 2, \ldots, n.$$


The remaining key is to express what the simplex method tries to accomplish 
(according to the optimality test) in terms of these symbols. Specifically, it seeks a 
set of basic variables, and the corresponding basic feasible solution, such that all 
coefficients in row 0 are nonnegative. It then stops with this optimal solution. This 
goal is expressed symbolically as follows: 

Condition for Optimality: 
$$z_j - c_j \ge 0, \quad \text{for } j = 1, 2, \ldots, n,$$
$$y_i \ge 0, \quad \text{for } i = 1, 2, \ldots, m.$$

After substituting the preceding expression for $z_j$, the condition for optimality says 
that the simplex method can be interpreted as seeking values for $y_1, y_2, \ldots, y_m$ such 
that 

$$y_0 = \sum_{i=1}^{m} b_i y_i$$

subject to 

$$\sum_{i=1}^{m} a_{ij} y_i \ge c_j, \quad \text{for } j = 1, 2, \ldots, n$$

and 

$$y_i \ge 0, \quad \text{for } i = 1, 2, \ldots, m.$$








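Table 6.4 Notation for Entries in Row 0 of Simplex Tableau 

Iteration | Basic Variable | Eq. No. | Coefficient of: Z | $x_1$ | $x_2$ | ... | $x_n$ | $x_{n+1}$ | $x_{n+2}$ | ... | $x_{n+m}$ | Right Side 
Any | Z | 0 | 1 | $z_1 - c_1$ | $z_2 - c_2$ | ... | $z_n - c_n$ | $y_1$ | $y_2$ | ... | $y_m$ | $y_0$ 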


But, except for lacking an objective for $y_0$, this problem is precisely the dual problem! 
To complete the formulation, let us now explore what the missing objective should 
be. 








Since $y_0$ is just the current value of Z, and since the objective for the primal 
problem is to maximize Z, a natural first reaction is that $y_0$ should be maximized also. 
However, this is not correct for the following rather subtle reason: The only feasible 
solutions for this new problem are those that satisfy the condition for optimality for 
the primal problem. Therefore, it is only the optimal solution for the primal problem 
that corresponds to a feasible solution for this new problem. As a consequence, the 
optimal value of Z in the primal problem is the minimum feasible value of $y_0$ in the 
new problem, so $y_0$ should be minimized. (The full justification for this conclusion is 
provided by the relationships we develop in Sec. 6.3.) Adding this objective of 
minimizing $y_0$ gives the complete dual problem. 

Consequently, the dual problem may be viewed as a restatement in linear 
programming terms of the goal of the simplex method, namely, to reach a solution for 
the primal problem that satisfies the optimality test. Before this goal has been reached, 
the corresponding y in row 0 (coefficients of slack variables) of the current tableau 
must be infeasible for the dual problem. However, after the goal is reached, the 
corresponding y must be an optimal solution (labeled $y^*$) for the dual problem, because 
it is a feasible solution that attains the minimum feasible value of $y_0$. This optimal 
solution $(y_1^*, y_2^*, \ldots, y_m^*)$ provides for the primal problem the shadow prices that 
were described in Sec. 4.7. Furthermore, this optimal $y_0$ is just the optimal value of 
Z, so the optimal objective function values are equal for the two problems. This fact 
also implies that $cx \le yb$ for any x and y that are feasible for the primal and dual 
problems, respectively. 

To illustrate, Table 6.5 shows row 0 for the respective iterations when the 
simplex method is applied to the Wyndor Glass Co. example. In each case, row 0 is 
partitioned into three parts: the coefficients of the original variables ($x_1$, $x_2$), the 
coefficients of the slack variables ($x_3$, $x_4$, $x_5$), and the right-hand side (value of Z). 
Each row 0 identifies a solution for the dual problem, as shown to its right in Table 
6.5. Included are the values of 

$$z_1 - c_1 = y_1 + 3y_3 - 3,$$

$$z_2 - c_2 = 2y_2 + 2y_3 - 5,$$

the surplus variables for the functional constraints of the dual problem, $y_1 + 3y_3 \ge 3$ 
and $2y_2 + 2y_3 \ge 5$. Thus a negative value for either surplus variable indicates that 
the corresponding constraint is violated. Also included is the value of the dual 
objective function, $y_0 = 4y_1 + 12y_2 + 18y_3$. As displayed in Table 6.4, all of these 
quantities are identified by row 0. 


Table 6.5 Respective Row 0's and Corresponding Dual Solutions for Wyndor Glass Co. Example 

Iteration | Primal Problem: Row 0 | Dual Problem: $y_1$ | $y_2$ | $y_3$ | $z_1 - c_1$ | $z_2 - c_2$ | $y_0$ 
0 | $[-3, -5 \mid 0, 0, 0 \mid 0]$ | 0 | 0 | 0 | -3 | -5 | 0 
1 | $[-3, 0 \mid 0, \tfrac{5}{2}, 0 \mid 30]$ | 0 | $\tfrac{5}{2}$ | 0 | -3 | 0 | 30 
2 | $[0, 0 \mid 0, \tfrac{3}{2}, 1 \mid 36]$ | 0 | $\tfrac{3}{2}$ | 1 | 0 | 0 | 36 


For the initial row 0, Table 6.5 shows that the corresponding dual solution, 
$(y_1, y_2, y_3) = (0, 0, 0)$, is infeasible because both surplus variables are negative. The 
first iteration succeeds in eliminating one of these negative values, but not the other. 
After two iterations, the optimality test is satisfied for the primal problem because all 
of the dual variables and surplus variables are nonnegative. This dual solution, 
$(y_1^*, y_2^*, y_3^*) = (0, \tfrac{3}{2}, 1)$, is optimal (as could be verified by applying the simplex 
method directly to the dual problem), so the optimal value of Z and $y_0$ is 
$Z^* = 36 = y_0^*$. 
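
The dual quantities read off row 0 can also be recomputed directly from the relationships $z_j = \sum_i a_{ij} y_i$ and $y_0 = \sum_i b_i y_i$. The sketch below does this for the three dual solutions of the Wyndor Glass Co. example, using exact fractions; the function name `row0_from_y` is an assumption chosen for illustration.

```python
from fractions import Fraction

A = [[1, 0], [0, 2], [3, 2]]   # a_ij for the Wyndor Glass Co. example
b = [4, 12, 18]                # b_i
c = [3, 5]                     # c_j


def row0_from_y(y):
    """Return (surplus values z_j - c_j, y_0, dual feasibility) for a given y."""
    m, n = len(b), len(c)
    surplus = [sum(A[i][j] * y[i] for i in range(m)) - c[j] for j in range(n)]
    y0 = sum(b[i] * y[i] for i in range(m))
    feasible = all(s >= 0 for s in surplus) and all(yi >= 0 for yi in y)
    return surplus, y0, feasible


dual_solutions = [
    (0, 0, 0),                 # initial row 0
    (0, Fraction(5, 2), 0),    # after one iteration
    (0, Fraction(3, 2), 1),    # after two iterations
]
results = [row0_from_y(y) for y in dual_solutions]
for iteration, (surplus, y0, feasible) in enumerate(results):
    print(iteration, [str(s) for s in surplus], str(y0), feasible)
```

Only the last of the three solutions passes the feasibility check, which is the same point at which the primal optimality test is satisfied.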


Summary of Primal-Dual Relationships 


Now let us summarize the newly discovered key relationships between the primal and 
dual problems. 


Weak duality property: If x is a feasible solution for the primal problem and 
y is a feasible solution for the dual problem, then 

$$cx \le yb.$$

For example, for the Wyndor Glass Co. problem, one feasible solution is (using the 
superscript T to denote the transpose operation described in Appendix 3) $x = [3, 3]^T$, 
which yields $Z = cx = 24$, and one feasible solution for the dual problem is 
$y = [1, 1, 2]$, which yields a larger objective function value, $y_0 = yb = 52$. For any 
such pair of feasible solutions, this inequality must hold because the maximum feasible 
value of $Z = cx$ (36) equals the minimum feasible value of the dual objective function 
$y_0 = yb$, which is our next property. 


Strong duality property: If $x^*$ is an optimal solution for the primal problem 
and $y^*$ is an optimal solution for the dual problem, then 


cx* = y*b. 


Complementary solutions property: At each iteration, the simplex method 
simultaneously identifies a corner-point feasible solution x for the primal prob- 
lem and a complementary solution y for the dual problem (found in row 0, 
coefficients of the slack variables), where 


cx = yb. 
If x is not optimal for the primal problem, then y is not feasible for the dual 
problem. 


To illustrate the complementary solutions property, after one iteration for the Wyndor 
Glass Co. problem, $x = [0, 6]^T$ and $y = [0, \tfrac{5}{2}, 0]$, with $cx = 30 = yb$. 


Complementary optimal solutions property: At the final iteration, the 
simplex method simultaneously identifies an optimal solution $x^*$ for the primal 
problem and a complementary optimal solution $y^*$ for the dual problem (found 
in row 0, coefficients of the slack variables), where 

$$cx^* = y^*b.$$

The $y_i^*$ are the shadow prices for the primal problem. 


For the example, the final iteration yields $x^* = [2, 6]^T$ and $y^* = [0, \tfrac{3}{2}, 1]$, with 
$cx^* = 36 = y^*b$. 

We shall take a closer look at some of these properties in Sec. 6.3. There you 
will see that the complementary solutions property can be extended considerably 
further. In particular, after slack and surplus variables are introduced to augment the 
respective problems, every basic solution in the primal problem has a complementary 
basic solution in the dual problem. We already have noted that the simplex method 
identifies the values of the surplus variables for the dual problem as the $(z_j - c_j)$ in 
Table 6.4. This result then leads to an additional complementary slackness property 
that relates the basic variables in one problem to the nonbasic variables in the other 
(Tables 6.7 and 6.8), but more about that later. 

In Sec. 6.4, after describing how to construct the dual problem when the primal 
problem is not in our standard form, we discuss another very useful property, which 
is summarized as follows: 


Symmetry property: For any primal problem and its dual problem, all rela- 
tionships between them must be symmetric because the dual of this dual problem 
is this primal problem. 


Therefore, all of the preceding properties hold regardless of which of the two problems 
is labeled as the primal problem. (The direction of the inequality for the weak duality 
property does require that the primal problem be expressed or reexpressed in max- 
imization form and the dual problem in minimization form.) Consequently, the sim- 
plex method can be applied to either problem, and it simultaneously will identify com- 
plementary solutions (ultimately a complementary optimal solution) for the other 
problem. 


Applications 


As we have just implied, one important application of duality theory is that the dual 
problem can be solved directly by the simplex method in order to identify an optimal 
solution for the primal problem. We discussed in Sec. 4.8 that the number of functional 
constraints affects the computational effort of the simplex method far more than the 
number of variables. If m > n, so that the dual problem has fewer functional con- 
straints (n) than the primal problem (m), then applying the simplex method directly 
to the dual problem instead of the primal problem probably will achieve a substantial 
reduction in computational effort. 

The weak and strong duality properties describe key relationships between the 
primal and dual problems. One useful application is for evaluating a proposed solution 
for the primal problem. For example, suppose that x is a feasible solution that has 
been proposed for implementation, and that a feasible solution y has been found by 
inspection for the dual problem such that $cx = yb$. In this case, x must be optimal 
without even applying the simplex method! Even if $cx < yb$, then $yb$ still provides 
an upper bound on the optimal value of Z, so if $(yb - cx)$ is small, intangible factors 
favoring x may lead to its selection without further ado. 
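
This use of the weak duality property can be sketched as a small check that takes a proposed x and a dual solution y found by inspection, confirms that both are feasible, and reports the gap $yb - cx$. The function name `optimality_gap` and the choice to raise an error for infeasible input are assumptions for illustration; the data are those of the Wyndor Glass Co. example.

```python
from fractions import Fraction

A = [[1, 0], [0, 2], [3, 2]]
b = [4, 12, 18]
c = [3, 5]


def optimality_gap(x, y):
    """Return yb - cx for a feasible pair (x, y); raise ValueError otherwise.

    A gap of zero certifies that x is optimal. A positive gap bounds how far
    Z = cx can be from the optimal value.
    """
    m, n = len(b), len(c)
    primal_ok = all(xj >= 0 for xj in x) and all(
        sum(A[i][j] * x[j] for j in range(n)) <= b[i] for i in range(m)
    )
    dual_ok = all(yi >= 0 for yi in y) and all(
        sum(A[i][j] * y[i] for i in range(m)) >= c[j] for j in range(n)
    )
    if not (primal_ok and dual_ok):
        raise ValueError("the bound requires a feasible x and a feasible y")
    cx = sum(c[j] * x[j] for j in range(n))
    yb = sum(b[i] * y[i] for i in range(m))
    return yb - cx


gap_loose = optimality_gap([3, 3], [1, 1, 2])               # yb = 52, cx = 24
gap_tight = optimality_gap([2, 6], [0, Fraction(3, 2), 1])  # yb = 36, cx = 36
print(gap_loose, gap_tight)
```

A zero gap, as in the second call, is the situation in which x can be declared optimal without running the simplex method.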

One of the key applications of the complementary solutions property is its use 
in the dual simplex method presented in Sec. 9.2. This algorithm operates on the 
primal problem exactly as if the simplex method were being applied simultaneously 
to the dual problem, which can be done because of this property. Because the roles 
of row 0 and the right side in the simplex tableau have been reversed, the dual simplex 
method requires that row 0 begins and remains nonnegative while the right side begins 
with some negative values (subsequent iterations strive to reach a nonnegative right 
side). Consequently, this algorithm occasionally is used because it is more convenient 
to set up the initial tableau in this form than in the form required by the simplex 
method. Furthermore, it frequently is used for reoptimization (discussed in Sec. 4.7), 
because changes in the original model lead to the revised final tableau fitting this 
form. This situation is common for certain types of sensitivity analysis, as you will 
see later in the chapter. 

In general terms, duality theory plays a central role in sensitivity analysis. This 
role is the topic of Sec. 6.5. 

Another important application is its use in the economic interpretation of the 
dual problem and the resulting insights for analyzing the primal problem. You already 
have seen one example when we discussed shadow prices in Sec. 4.7. The next section 
describes how this interpretation extends to the entire dual problem and then to the 
simplex method. 


6.2 Economic Interpretation of Duality 





The economic interpretation of duality is based directly upon the typical interpretation 
for the primal problem (linear programming problem in our standard form) presented 
in Sec. 3.2. To refresh your memory, we have summarized this interpretation of the 
primal problem in Table 6.6. 


Interpretation of the Dual Problem 


To see how this interpretation of the primal problem leads to an economic 
interpretation for the dual problem,¹ note in Table 6.4 that $y_0$ is the value of Z (total profit) 
at the current iteration. Because 

$$y_0 = b_1 y_1 + b_2 y_2 + \cdots + b_m y_m,$$

each $b_i y_i$ can thereby be interpreted as the current contribution to profit by having $b_i$ 
units of resource i available for the primal problem. Thus 


$y_i$ is interpreted as the contribution to profit per unit of resource i (i = 1, 2, ..., m), 
when the current set of basic variables is used to obtain the primal solution. 


¹ Actually, several slightly different interpretations have been proposed. The one presented here seems to 
us to be the most useful because it also directly interprets what the simplex method does in the primal 
problem. 


Table 6.6 Economic Interpretation of Primal Problem 

Quantity | Interpretation 
$x_j$ | Level of activity j (j = 1, 2, ..., n) 
$c_j$ | Unit profit from activity j 
Z | Total profit from all activities 
$b_i$ | Amount of resource i available (i = 1, 2, ..., m) 
$a_{ij}$ | Amount of resource i consumed by each unit of activity j 





In other words, the $y_i$ (or $y_i^*$ in the optimal solution) are just the shadow prices 
discussed in Sec. 4.7. 

This interpretation of the dual variables leads to our interpretation of the overall 
dual problem. Specifically, since each unit of activity j in the primal problem consumes 
$a_{ij}$ units of resource i, 


$\sum_{i=1}^{m} a_{ij} y_i$ is interpreted as the current contribution to profit of the mix of resources 
that would be consumed if one unit of activity j were used (j = 1, 2, ..., n). 


This same mix of resources (and more) probably can be used in other ways as 
well, but no alternative use should be considered if it is less profitable than one unit 
of activity j. Since $c_j$ is interpreted as the unit profit from activity j, each functional 
constraint in the dual problem is interpreted as follows: 


$\sum_{i=1}^{m} a_{ij} y_i \ge c_j$ says that the actual contribution to profit of the above mix of resources 
must be at least as much as if they were used by one unit of activity j; otherwise, we 
would not be making the best possible use of these resources. 


Similarly, the interpretation of the nonnegativity constraints is the following: 


$y_i \ge 0$ says that the contribution to profit of resource i (i = 1, 2, ..., m) must be 
nonnegative; otherwise, it would be better not to use this resource at all. 


The objective, 

$$\text{Minimize } y_0 = \sum_{i=1}^{m} b_i y_i,$$

can be viewed as minimizing the total implicit value of the resources consumed by 
the activities. 

This interpretation can be sharpened somewhat by differentiating between basic 
and nonbasic variables in the primal problem for any given basic feasible solution 
$(x_1, x_2, \ldots, x_{n+m})$. Recall that the basic variables (the only variables whose values 
can be nonzero) always have a coefficient of zero in row 0. Therefore, referring again 
to Table 6.4 and the accompanying equation for $z_j$, 

$$\sum_{i=1}^{m} a_{ij} y_i = c_j, \quad \text{if } x_j > 0 \quad (j = 1, 2, \ldots, n),$$

$$y_i = 0, \quad \text{if } x_{n+i} > 0 \quad (i = 1, 2, \ldots, m).$$


(This is one version of the complementary slackness property discussed in the next 
section.) The economic interpretation of the first statement is that whenever an activity 
j operates at a strictly positive level ($x_j > 0$), the marginal value of the resources it 
consumes must equal (as opposed to exceeding) the unit profit from this activity. The 
second statement implies that the marginal value of resource i is zero ($y_i = 0$) 
whenever the supply of this resource is not exhausted by the activities ($x_{n+i} > 0$). In 
economic terminology, such a resource is a "free good"; the price of goods that are 
oversupplied must drop to zero by the law of supply and demand. This fact is what 
justifies interpreting the objective for the dual problem as minimizing the total implicit 
value of the resources consumed, rather than the resources allocated. 
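
Both statements can be checked numerically for the optimal solution of the Wyndor Glass Co. example, $x^* = [2, 6]^T$ and $y^* = [0, \tfrac{3}{2}, 1]$. The sketch below is illustrative only; the variable names are assumptions, and the slack values $x_{n+i}$ are computed from the constraints rather than read from a tableau.

```python
from fractions import Fraction

A = [[1, 0], [0, 2], [3, 2]]
b = [4, 12, 18]
c = [3, 5]

x = [2, 6]                    # optimal primal solution
y = [0, Fraction(3, 2), 1]    # optimal dual solution (shadow prices)
m, n = len(b), len(c)

# Slack x_{n+i}: amount of resource i left unused by the activities.
slack = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
# Implicit value of the resources consumed by one unit of activity j.
resource_value = [sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]

# Statement 1: x_j > 0 implies sum_i a_ij y_i = c_j.
active_ok = all(resource_value[j] == c[j] for j in range(n) if x[j] > 0)
# Statement 2: x_{n+i} > 0 implies y_i = 0 (resource i is a free good).
free_good_ok = all(y[i] == 0 for i in range(m) if slack[i] > 0)

print(slack, [str(v) for v in resource_value], active_ok, free_good_ok)
```

Here only resource 1 has unused supply, and it is also the only resource with a zero shadow price.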


Interpretation of the Simplex Method 


The interpretation of the dual problem also provides an economic interpretation of 
what the simplex method does in the primal problem. The goal of the simplex method 
is to find how to use the available resources in the most profitable feasible way. To 
attain this goal we must reach a basic feasible solution that satisfies all the requirements 
on profitable use of the resources (the constraints of the dual problem). These require- 
ments comprise the condition for optimality for the algorithm. For any given basic 
feasible solution, the requirements (dual constraints) associated with the basic vari- 
ables are automatically satisfied (with equality). However, those associated with non- 
basic variables may or may not be satisfied. 

In particular, if an original variable $x_j$ is nonbasic so that activity $j$ is not used,
then the current contribution to profits of the resources that would be required to
undertake each unit of activity $j$,

$$\sum_{i=1}^{m} a_{ij} y_i,$$

may be either smaller ($<$) or larger ($\ge$) than the unit profit $c_j$ obtainable from the
activity. If it is smaller, so $(z_j - c_j) < 0$ in row 0 of the simplex tableau, then these
resources can be used more profitably by initiating this activity. If it is larger, then
these resources already are being assigned elsewhere in a more profitable way, so
they should not be diverted to activity $j$.

Similarly, if a slack variable $x_{n+i}$ is nonbasic so that the total allocation $b_i$ of
resource $i$ is being used, then $y_i$ is the current contribution to profit of this resource
on a marginal basis. Hence, if $y_i < 0$, profit can be increased by cutting back on the
use of this resource (i.e., increasing $x_{n+i}$). If $y_i \ge 0$, it is worthwhile to continue
fully using this resource.

Therefore, what the simplex method does is to examine all the nonbasic variables
in the current basic feasible solution to see which ones can provide a more profitable
use of the resources by being increased. If none can, so that no feasible shifts or
reductions in the current proposed use of the resources can increase profit, the current
solution must be optimal. If one or more can, the simplex method selects the variable
that, if increased by 1, would improve the profitability of the use of the resources the
most. It then actually increases this variable (the entering basic variable) as much as
it can until the marginal values of the resources change. This increase results in a
new basic feasible solution with a new row 0 (dual solution), and the whole process
is repeated.
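
The selection step just described can be sketched in a few lines of Python. This is only an illustration, assuming that row 0 is supplied as a dictionary from nonbasic variable names to their current coefficients ($z_j - c_j$ for an original variable, $y_i$ for a slack variable); the function name and the dictionary layout are hypothetical.

```python
def entering_variable(row0_nonbasic):
    """Return the nonbasic variable with the most negative row-0 coefficient.

    A negative coefficient means a unit increase of that variable raises profit.
    Returns None when no coefficient is negative, i.e., the solution is optimal.
    """
    name, coefficient = min(row0_nonbasic.items(), key=lambda item: item[1])
    return name if coefficient < 0 else None

# Wyndor Glass Co., initial basic feasible solution: x1 and x2 are nonbasic.
print(entering_variable({"x1": -3, "x2": -5}))
# Wyndor Glass Co., final basic feasible solution: slacks x4 and x5 are nonbasic.
print(entering_variable({"x4": 1.5, "x5": 1}))
```

In the first call the coefficients are the surplus values $-3$ and $-5$, so activity 2 is the more profitable one to initiate; in the second call both coefficients are nonnegative, so no shift in the use of the resources can increase profit.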

To solidify your understanding of this interpretation of the simplex method, we
suggest that you apply it to the Wyndor Glass Co. problem, using both Fig. 3.2 and
Table 4.8. (See Prob. 6.)

The economic interpretation of the dual problem considerably expands our ability
to analyze the primal problem. However, you already have seen in Sec. 6.1 that
this interpretation is just one ramification of the relationships between the two problems.


6.3 Primal-Dual Relationships





Because the dual problem is a linear programming problem, it also has corner-point
solutions. Furthermore, by using the augmented form of the problem, we can express
these corner-point solutions as basic solutions. Because the functional constraints have
the $\ge$ form, this augmented form is obtained by subtracting the surplus (rather than
adding the slack) from the left-hand side of each constraint $j$ ($j = 1, 2, \ldots, n$).¹
This surplus is

$$z_j - c_j = \sum_{i=1}^{m} a_{ij} y_i - c_j, \quad \text{for } j = 1, 2, \ldots, n.$$

Thus $(z_j - c_j)$ plays the role of the surplus variable for constraint $j$ (or its slack
variable if the constraint is multiplied through by $-1$). Therefore, augmenting each
corner-point solution $(y_1, y_2, \ldots, y_m)$ yields a basic solution
$(y_1, y_2, \ldots, y_m, z_1 - c_1, z_2 - c_2, \ldots, z_n - c_n)$ by using this expression for $(z_j - c_j)$. Since the
augmented form has $n$ functional constraints and $(n + m)$ variables, each basic solution
has $n$ basic variables and $m$ nonbasic variables. (Note how $m$ and $n$ reverse
their previous roles here because, as Table 6.3 indicates, dual constraints correspond
to primal variables and dual variables correspond to primal constraints.)
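
The augmentation step can be sketched as follows, again using the Wyndor Glass Co. coefficients as sample data. The helper name `augment_dual` is hypothetical; it simply evaluates the surplus expression $z_j - c_j = \sum_i a_{ij} y_i - c_j$ for each dual constraint and appends the results to the given $y$ values.

```python
from fractions import Fraction as F

# Wyndor Glass Co. coefficients: m = 3 primal constraints, n = 2 primal variables.
A = [[1, 0], [0, 2], [3, 2]]
c = [3, 5]

def augment_dual(A, c, y):
    """Append the n surplus values (z_j - c_j) to the m dual values y_i."""
    m, n = len(A), len(c)
    surplus = [sum(A[i][j] * y[i] for i in range(m)) - c[j] for j in range(n)]
    return list(y) + surplus

print(augment_dual(A, c, [0, 0, 0]))
print([str(v) for v in augment_dual(A, c, [F(0), F(3, 2), F(1)])])
```

A negative surplus (as in the first call) signals that the corresponding dual constraint is violated, so that particular dual solution is not feasible.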


Complementary Basic Solutions 


One of the important relationships between the primal and dual problems is a direct 
correspondence between their basic solutions. The key to this correspondence is row 
0 of the simplex tableau for the primal basic solution, such as shown in Table 6.4 or 
6.5. Such a row 0 can be obtained for any primal basic solution, feasible or not, by 
using the formulas given in the bottom part of Table 5.7. 

Note again in Tables 6.4 and 6.5 how a complete solution for the dual problem 
(including the surplus variables) can be read directly from row 0. Thus, because of 
its coefficient in row 0, each variable in the primal problem has an associated variable 
in the dual problem, as summarized in Table 6.7. 

A key insight here is that the dual solution read from row 0 must also be a basic
solution! The reason is that the $m$ basic variables for the primal problem are required
to have a coefficient of zero in row 0, which thereby requires the $m$ associated dual
variables to be zero, i.e., nonbasic variables for the dual problem. The values of the
remaining $n$ (basic) variables then will be the simultaneous solution to the system of
equations given at the beginning of the section. In matrix form, this system of equations
is $\mathbf{z} - \mathbf{c} = \mathbf{y}\mathbf{A} - \mathbf{c}$, and the fundamental insight of Sec. 5.3 actually identifies
its solution for $\mathbf{z} - \mathbf{c}$ and $\mathbf{y}$ as being the corresponding entries in row 0.


Table 6.7 Association between Variables in Primal and Dual Problems

Primal Variable                 Associated Dual Variable
(Original variable) $x_j$       $(z_j - c_j)$ (surplus variable), $j = 1, 2, \ldots, n$
(Slack variable) $x_{n+i}$      $y_i$ (original variable), $i = 1, 2, \ldots, m$


¹ You might wonder why we do not also introduce artificial variables into these constraints as discussed
in Sec. 4.6. The reason is that these variables have no purpose other than to change the feasible region
temporarily as a convenience in starting the simplex method. We are not interested now in applying the
simplex method to the dual problem, and we do not want to change its feasible region.

Because of the symmetry property quoted in Sec. 6.1 (and the direct association
between variables shown in Table 6.7), the correspondence between basic solutions
in the primal and dual problems is a symmetric one. Furthermore, a pair of complementary
basic solutions has the same objective function value, shown as $y_0$ in Table
6.4.

Let us now summarize our conclusions about the correspondence between primal 
and dual basic solutions, where the first property extends the complementary solutions 
property of Sec. 6.1 to the augmented forms of the two problems and then to any 
basic solution (feasible or not) in the primal problem. 


Complementary basic solutions property: Each basic solution in the primal
problem has a complementary basic solution in the dual problem, where their
respective objective function values ($Z$ and $y_0$) are equal. Given row 0 of the
simplex tableau for the primal basic solution, the complementary dual basic
solution $(\mathbf{y}, \mathbf{z} - \mathbf{c})$ is found as shown in Table 6.4.


The next property shows how to identify the basic and nonbasic variables in this 
complementary basic solution. 


Complementary slackness property: Using the association between variables 
given in Table 6.7, the variables in the primal basic solution and the comple- 
mentary dual basic solution satisfy the complementary slackness relationship 
shown in Table 6.8. Furthermore, this relationship is a symmetric one, so that 
these two basic solutions are complementary to each other. 


The reason for using the name complementary slackness for this latter property 
is that it says (in part) that for each pair of associated variables, if one of them has 
slack in its nonnegativity constraint (a basic variable > 0), then the other one must 
have no slack (a nonbasic variable = 0). We mentioned in Sec. 6.2 that this property 
has a useful economic interpretation for linear programming problems. 


Table 6.8 Complementary Slackness Relationship for Complementary Basic Solutions

Primal Variable     Associated Dual Variable
Basic               Nonbasic   ($m$ variables)
Nonbasic            Basic      ($n$ variables)


EXAMPLE: To illustrate these two properties, again consider the Wyndor Glass Co.
problem of Sec. 3.1. All eight of its basic solutions (five feasible and three infeasible)
are shown in Tables 5.5 and 5.6 along with the corresponding corner-point solutions.
Thus its dual problem (see Table 6.1) also must have eight basic solutions, each
complementary to one of these primal solutions, as shown in Table 6.9.

The three basic feasible solutions obtained by the simplex method for the primal 
problem are the first, fifth, and sixth primal solutions shown in Table 6.9. You already 
saw in Table 6.5 how the complementary basic solutions for the dual problem can be 
read directly from row 0, starting with the coefficients of the slack variables and then 
the original variables. The other dual basic solutions also could be identified in this 
way by constructing row 0 for each of the other primal basic solutions, using the 
formulas given in the bottom part of Table 5.7. 

Alternatively, for each primal basic solution, the complementary slackness property
can be used to identify the basic and nonbasic variables for the complementary
dual basic solution, so that the system of equations given at the beginning of the
section can be solved directly to obtain this complementary solution. For example,
consider the next-to-last primal basic solution in Table 6.9, where $x_1$, $x_2$, and $x_5$ are
basic variables. Using Tables 6.7 and 6.8, we see that the complementary slackness
property implies that $(z_1 - c_1)$, $(z_2 - c_2)$, and $y_3$ are nonbasic variables for the
complementary dual basic solution. Setting these variables equal to zero in the dual
problem equations, $y_1 + 3y_3 - (z_1 - c_1) = 3$ and $2y_2 + 2y_3 - (z_2 - c_2) = 5$,
immediately yields $y_1 = 3$, $y_2 = \tfrac{5}{2}$.
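
The procedure in the preceding paragraph can be sketched in Python for the Wyndor Glass Co. data. This is an illustrative implementation under stated assumptions, not the book's own algorithm: the function names are hypothetical, primal variables are identified by their 1-based indices $1, \ldots, n+m$, and the small Gauss-Jordan routine assumes the chosen system is nonsingular (it raises `StopIteration` otherwise).

```python
from fractions import Fraction as F

A = [[1, 0], [0, 2], [3, 2]]   # Wyndor Glass Co.: m = 3 constraints, n = 2 activities
c = [3, 5]
m, n = len(A), len(c)

def solve(M, rhs):
    """Solve the square system M w = rhs exactly by Gauss-Jordan elimination."""
    k = len(rhs)
    aug = [[F(v) for v in row] + [F(r)] for row, r in zip(M, rhs)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(k):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def complementary_dual(primal_basic):
    """Return (y_1..y_m, z_1-c_1..z_n-c_n) for a set of basic primal indices."""
    def partner(p):
        # Table 6.7: x_j pairs with surplus j; slack x_{n+i} pairs with y_i.
        return m + p - 1 if p <= n else p - n - 1

    def column(k):
        # Column of dual variable k in the equations sum_i a_ij*y_i - (z_j - c_j) = c_j.
        if k < m:
            return [A[k][j] for j in range(n)]
        return [-1 if j == k - m else 0 for j in range(n)]

    # Complementary slackness: dual basic variables pair with primal nonbasic ones.
    nonbasic = [p for p in range(1, n + m + 1) if p not in primal_basic]
    dual_basic = sorted(partner(p) for p in nonbasic)
    cols = [column(k) for k in dual_basic]
    M = [[cols[q][j] for q in range(n)] for j in range(n)]
    w = [F(0)] * (m + n)
    for k, value in zip(dual_basic, solve(M, c)):
        w[k] = value
    return w

print([str(v) for v in complementary_dual({1, 2, 5})])
```

Calling the function with basic variables $x_1$, $x_2$, $x_5$ reproduces the values $y_1 = 3$, $y_2 = \tfrac{5}{2}$ derived above, with the remaining three dual variables equal to zero.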

Finally, notice that Table 6.9 demonstrates that $(0, \tfrac{3}{2}, 1, 0, 0)$ is the optimal
solution for the dual problem, because it is the basic feasible solution with minimal
$y_0$ (36).


Relationships between Complementary Basic Solutions 


We now turn our attention to the relationships between complementary basic solutions,
beginning with their feasibility relationships. The middle columns in Table 6.9 provide
some valuable clues. For the pairs of complementary solutions, notice how the yes
or no answers on feasibility also satisfy a complementary relationship in most cases.
In particular, with one exception, whenever one solution is feasible, the other is not.
(It also is possible for neither solution to be feasible, as happened with the third pair.)
The one exception is the sixth pair, where the primal solution is known to be optimal.
The explanation is suggested by the $Z = y_0$ column. Because the sixth dual solution
also is optimal (by the complementary optimal solutions property), with $y_0 = 36$,
then the first five dual solutions cannot be feasible because $y_0 < 36$ (remember that
the dual problem objective is to minimize $y_0$). By the same token, the last two primal
solutions cannot be feasible because $Z > 36$.


Table 6.9 Complementary Basic Solutions for Wyndor Glass Co. Example

       Primal Problem                                   Dual Problem
No.    Basic Solution        Feasible?    Z = y_0    Feasible?    Basic Solution
1      (0, 0, 4, 12, 18)     Yes           0         No           (0, 0, 0, -3, -5)
2      (4, 0, 0, 12, 6)      Yes          12         No           (3, 0, 0, 0, -5)
3      (6, 0, -2, 12, 0)     No           18         No           (0, 0, 1, 0, -3)
4      (4, 3, 0, 6, 0)       Yes          27         No           (-9/2, 0, 5/2, 0, 0)
5      (0, 6, 4, 0, 6)       Yes          30         No           (0, 5/2, 0, -3, 0)
6      (2, 6, 2, 0, 0)       Yes          36         Yes          (0, 3/2, 1, 0, 0)
7      (4, 6, 0, 0, -6)      No           42         Yes          (3, 5/2, 0, 0, 0)
8      (0, 9, 4, -6, 0)      No           45         Yes          (0, 0, 5/2, 9/2, 0)


Table 6.10 Classification of Basic Solutions

                    Satisfies Condition for Optimality?
                    Yes               No
Feasible?   Yes     Optimal           Suboptimal
            No      Superoptimal      Neither feasible nor superoptimal





This explanation is further supported by the strong duality property that optimal
primal and dual solutions have $Z = y_0$.

Next, let us state the extension of the complementary optimal solutions property 
of Sec. 6.1 for the augmented forms of the two problems. 


Complementary optimal basic solutions property: Each optimal basic solution
in the primal problem has a complementary optimal basic solution in the
dual problem, where their respective objective function values ($Z$ and $y_0$) are
equal.¹ Given row 0 of the simplex tableau for the optimal primal solution,
the complementary optimal dual solution $(\mathbf{y}^*, \mathbf{z}^* - \mathbf{c})$ is found as shown in
Table 6.4.


To review the reasoning behind this property, note that the dual solution (y*, 
z* — c) must be feasible for the dual problem because the condition for optimality 
for the primal problem requires that all these dual variables (including surplus vari- 
ables) be nonnegative. Since this solution is feasible, it must be optimal for the dual 
problem by the weak duality property. 

Basic solutions can be classified according to whether or not they satisfy each
of two conditions. One is the condition for feasibility, namely, whether all the variables
(including slack variables) in the augmented solution are nonnegative. The other
is the condition for optimality, namely, whether all the coefficients in row 0 (i.e., all
the variables in the complementary basic solution) are nonnegative. Our names for
the different types of basic solutions are summarized in Table 6.10. For example, in
Table 6.9, primal basic solutions 1, 2, 4, and 5 are suboptimal, 6 is optimal, 7 and
8 are superoptimal, and 3 is neither feasible nor superoptimal.
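
The two conditions translate directly into a small classifier. The sketch below is illustrative only: the function name is hypothetical, and the eight pairs of complementary basic solutions are the Wyndor Glass Co. values (fractions written exactly with `Fraction`).

```python
from fractions import Fraction as F

def classify(solution, complementary):
    """Name a basic solution using the two conditions of Table 6.10."""
    feasible = all(v >= 0 for v in solution)                # condition for feasibility
    meets_optimality = all(v >= 0 for v in complementary)   # condition for optimality
    if feasible and meets_optimality:
        return "optimal"
    if feasible:
        return "suboptimal"
    if meets_optimality:
        return "superoptimal"
    return "neither feasible nor superoptimal"

# (primal basic solution, complementary dual basic solution) for the Wyndor example.
PAIRS = [
    ((0, 0, 4, 12, 18), (0, 0, 0, -3, -5)),
    ((4, 0, 0, 12, 6), (3, 0, 0, 0, -5)),
    ((6, 0, -2, 12, 0), (0, 0, 1, 0, -3)),
    ((4, 3, 0, 6, 0), (F(-9, 2), 0, F(5, 2), 0, 0)),
    ((0, 6, 4, 0, 6), (0, F(5, 2), 0, -3, 0)),
    ((2, 6, 2, 0, 0), (0, F(3, 2), 1, 0, 0)),
    ((4, 6, 0, 0, -6), (3, F(5, 2), 0, 0, 0)),
    ((0, 9, 4, -6, 0), (0, 0, F(5, 2), F(9, 2), 0)),
]

primal_labels = [classify(p, d) for p, d in PAIRS]
dual_labels = [classify(d, p) for p, d in PAIRS]
for number, (p_label, d_label) in enumerate(zip(primal_labels, dual_labels), start=1):
    print(number, p_label, "|", d_label)
```

Because the same function is applied with the roles of the two solutions swapped, a suboptimal primal solution is always paired with a superoptimal dual solution and vice versa.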

Using these definitions, the general relationships between complementary basic
solutions are summarized in Table 6.11. The resulting range of possible (common)
values for the objective functions ($Z = y_0$) for the first three pairs given in Table
6.11 (the last pair can have any value) is shown in Fig. 6.1. Thus, while the simplex
method is dealing directly with suboptimal basic solutions and working toward optimality
in the primal problem, it is simultaneously dealing indirectly with complementary
superoptimal solutions and working toward feasibility in the dual problem.
Conversely, it sometimes is more convenient (or necessary) to work directly with
superoptimal basic solutions and to move toward feasibility in the primal problem,
which is the purpose of the dual simplex method described in Sec. 9.2.


Table 6.11 Relationships between Complementary Basic Solutions

Primal Basic Solution                 Complementary Dual Basic Solution
Suboptimal                            Superoptimal
Optimal                               Optimal
Superoptimal                          Suboptimal
Neither feasible nor superoptimal     Neither feasible nor superoptimal


¹ Because of the symmetry property, it thereby follows that if either problem possesses at least one optimal
solution, the other must also. The only ways in which both problems can have no optimal solutions are
when (1) both problems have no feasible solutions, or (2) one problem has no feasible solutions and the
other problem has an unbounded feasible region that permits improving the objective function value
indefinitely in the favorable direction.

These relationships prove very useful, particularly in sensitivity analysis, as you 
will see later in the chapter. 


6.4 Adapting to Other Primal Forms 


Thus far it has been assumed that the model for the primal problem is in our standard 
form. However, we indicated at the beginning of the chapter that any linear program- 
ming problem, whether in our standard form or not, possesses a dual problem. There- 
fore, this section focuses on how the dual problem changes for other primal forms. 
Figure 6.1 Range of possible values of $Z = y_0$ for certain types of complementary basic solutions.
[Figure: a vertical scale of objective function values, with the primal problem ($Z = \sum_{j=1}^{n} c_j x_j$) on the left
and the dual problem ($y_0 = \sum_{i=1}^{m} b_i y_i$) on the right. Above the common optimal value $Z^* = y_0^*$ lie
superoptimal primal and suboptimal dual solutions; below it lie suboptimal primal and superoptimal dual solutions.]


Table 6.12 Conversions to Standard Form for Linear Programming Models

Nonstandard Form                          Equivalent Standard Form
Minimize $Z$                              Maximize $(-Z)$
$\sum_{j=1}^{n} a_{ij} x_j \ge b_i$       $-\sum_{j=1}^{n} a_{ij} x_j \le -b_i$
$\sum_{j=1}^{n} a_{ij} x_j = b_i$         $\sum_{j=1}^{n} a_{ij} x_j \le b_i$ and $-\sum_{j=1}^{n} a_{ij} x_j \le -b_i$
$x_j$ unconstrained in sign               $(x_j^+ - x_j^-)$, $x_j^+ \ge 0$, $x_j^- \ge 0$


Each nonstandard form was discussed in Sec. 4.6, and we pointed out how it
is possible to convert each one into an equivalent standard form if so desired. These
conversions are summarized in Table 6.12. Hence you always have the option of
converting any model into our standard form and then constructing its dual problem
in the usual way. To illustrate, we do this for our standard dual problem (it must have
a dual also) in Table 6.13. Note that what we end up with is just our standard primal
problem! Since any pair of primal and dual problems can be converted into these
forms, this fact demonstrates the following key property of primal-dual relationships:


Symmetry property: For any primal problem and its dual problem, all rela- 
tionships between them must be symmetric, because the dual of this dual problem 
is this primal problem. 


As a result, all the statements made earlier in the chapter about the relationships 
of the dual problem to the primal problem also hold in reverse. 

Another consequence of the symmetry property is that it is immaterial which
problem is called the primal and which is called the dual. In practice, you might see
a linear programming problem fitting our standard form being referred to as the dual
problem. The convention is that the model formulated to fit the actual problem is
called the primal problem, regardless of its form.


Table 6.13 Constructing the Dual of the Dual Problem

Dual problem                                  Converted to standard form
Minimize $y_0 = \mathbf{y}\mathbf{b}$,        Maximize $(-y_0) = -\mathbf{y}\mathbf{b}$,
subject to                                    subject to
    $\mathbf{y}\mathbf{A} \ge \mathbf{c}$     -->     $-\mathbf{y}\mathbf{A} \le -\mathbf{c}$
and                                           and
    $\mathbf{y} \ge \mathbf{0}$.              $\mathbf{y} \ge \mathbf{0}$.

Converted to standard form                    Its dual problem
Maximize $Z = \mathbf{c}\mathbf{x}$,          Minimize $(-Z) = -\mathbf{c}\mathbf{x}$,
subject to                                    subject to
    $\mathbf{A}\mathbf{x} \le \mathbf{b}$     <--     $-\mathbf{A}\mathbf{x} \ge -\mathbf{b}$
and                                           and
    $\mathbf{x} \ge \mathbf{0}$.              $\mathbf{x} \ge \mathbf{0}$.

Our illustration of how to construct the dual problem for a nonstandard primal
problem did not involve either equality constraints or variables unconstrained in sign.
Actually, for these two forms, a shortcut is available. It is possible to show [see Probs.
20 and 17(c)] that an equality constraint in the primal problem should be treated just
like a $\le$ constraint in constructing the dual problem except that the nonnegativity
constraint for the corresponding dual variable should be deleted (i.e., this variable is
unconstrained in sign). By the symmetry property, deleting a nonnegativity constraint
in the primal problem affects the dual problem only by changing the corresponding
constraint into an equality constraint.

Because of these shortcuts, it is necessary only to convert the primal problem
into the form shown in either column of Table 6.14. You then construct its dual
problem in the usual way, using the form shown in the other column. The double-sided
arrows in Table 6.14 show the specific correspondences between the two forms
that must be followed. Specifically, an inequality functional constraint in one problem
corresponds to including a nonnegativity constraint in the other, whereas an equality
constraint in one problem corresponds to deleting a nonnegativity constraint in the
other. Beware of mixing the forms in the two columns (e.g., maximize $Z$ with $\ge$
constraints) for defining the primal problem. Such mixing is not allowed for the
purpose of constructing the dual problem.

To illustrate this procedure, consider the radiation therapy example presented
in Sec. 3.4. (Its model is shown on p. 45.) Let this model be our primal problem.
Before finding its dual problem, we need to convert the primal problem into one of
the two allowable forms shown in Table 6.14. Let us illustrate it both ways.

The radiation therapy model already almost fits the form in the second column
of Table 6.14. The only discrepancy is that the first functional constraint,
$0.3x_1 + 0.1x_2 \le 2.7$, is in $\le$ form rather than $\ge$ or $=$. However, multiplying through the
constraint by $(-1)$ converts it into $\ge$ form, thereby yielding the allowable form of
the model shown on the left side of Table 6.15. Its dual problem then is constructed
in just the usual way (as summarized in Table 6.2a) except for following the form in
the first column of Table 6.14. The result is shown on the right side of Table 6.15.
Note that the nonnegativity constraint has been deleted for the second variable ($y_2$)
because the second functional constraint in the primal problem is an equality constraint.


Table 6.14 Corresponding Primal-Dual Forms

Primal Problem                            Dual Problem
(or Dual Problem)                         (or Primal Problem)

Maximize $Z$ (or $y_0$)                   Minimize $y_0$ (or $Z$)

Constraint $i$:                           Variable $y_i$ (or $x_i$):
    $\le$ form                <-->            $y_i \ge 0$
    $=$ form                  <-->            $y_i \ge 0$ deleted

Variable $x_j$ (or $y_j$):                Constraint $j$:
    $x_j \ge 0$               <-->            $\ge$ form
    $x_j \ge 0$ deleted       <-->            $=$ form







Table 6.15 One Primal-Dual Form for the Radiation Therapy Example

Primal problem                              Dual problem
Minimize $Z = 0.4x_1 + 0.5x_2$,             Maximize $y_0 = -2.7y_1 + 6y_2 + 6y_3$,
subject to                                  subject to
    $-0.3x_1 - 0.1x_2 \ge -2.7$                 $-0.3y_1 + 0.5y_2 + 0.6y_3 \le 0.4$
    $0.5x_1 + 0.5x_2 = 6$                       $-0.1y_1 + 0.5y_2 + 0.4y_3 \le 0.5$
    $0.6x_1 + 0.4x_2 \ge 6$
and                                         and
    $x_1 \ge 0$, $x_2 \ge 0$.                   $y_1 \ge 0$, $y_3 \ge 0$
                                                ($y_2$ unconstrained in sign).








Equivalently, the form in the first column of Table 6.14 can be used instead to 
set up the primal problem. (This form is needed anyway to apply the simplex method 
as presented in Chap. 4.) Using the first two conversions in Table 6.12, this approach 
leads to the form of the primal problem shown on the left side of Table 6.16. The 
corresponding dual form from the second column of Table 6.14 is then used to con- 
struct the dual problem shown on the right side of Table 6.16. 
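
The correspondences of Table 6.14 are mechanical enough to sketch in code. The following Python function is an illustration only (its name and the dictionary it returns are hypothetical): it takes a primal problem in the first-column form (maximize, with each constraint in $\le$ or $=$ form) and returns the dual in the second-column form. The sample data are the radiation therapy model after converting it to that form.

```python
def dual_of_max_problem(c, A, b, constraint_types, x_nonneg):
    """Build the dual of: maximize c x, rows of type '<=' or '=', x_j >= 0 or free."""
    for kind in constraint_types:
        if kind not in ("<=", "="):
            # Mixing forms (e.g., maximize with >= rows) is not allowed here.
            raise ValueError("convert '>=' constraints first (multiply through by -1)")
    m, n = len(A), len(c)
    return {
        "sense": "min",
        "objective": list(b),                                   # y_0 = sum_i b_i y_i
        "rows": [[A[i][j] for i in range(m)] for j in range(n)],  # transpose of A
        "rhs": list(c),
        # x_j >= 0 gives a >= constraint; deleting it gives an equality constraint.
        "constraint_types": [">=" if x_nonneg[j] else "=" for j in range(n)],
        # A <= row keeps y_i >= 0; an = row deletes that nonnegativity constraint.
        "y_nonneg": [kind == "<=" for kind in constraint_types],
    }

# Radiation therapy example in "maximize (-Z)" form.
dual = dual_of_max_problem(
    c=[-0.4, -0.5],
    A=[[0.3, 0.1], [0.5, 0.5], [-0.6, -0.4]],
    b=[2.7, 6, -6],
    constraint_types=["<=", "=", "<="],
    x_nonneg=[True, True],
)
print(dual)
```

The second dual variable comes out unconstrained in sign because the second primal constraint is an equality, which is exactly the shortcut described earlier in this section.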

Note that the two versions of the primal and dual problems in Tables 6.15 and
6.16 are completely equivalent, where $y_2' = -y_2$. This equivalency is inevitable
because the differences involve substituting only equivalent forms.
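
One way to see this equivalence is to carry out the substitution explicitly. The short derivation below is a sketch that assumes the second dual variable of Table 6.16 is the one written with a prime ($y_2'$), while $y_1$ and $y_3$ are common to both tables; it starts from the dual problem of Table 6.16 and recovers that of Table 6.15.

```latex
\begin{aligned}
\text{Table 6.16:}\quad
  & \text{Minimize } 2.7y_1 + 6y_2' - 6y_3, \\
  & 0.3y_1 + 0.5y_2' - 0.6y_3 \ge -0.4, \qquad
    0.1y_1 + 0.5y_2' - 0.4y_3 \ge -0.5. \\[4pt]
\text{Set } y_2' = -y_2:\quad
  & \text{Minimize } 2.7y_1 - 6y_2 - 6y_3, \\
  & 0.3y_1 - 0.5y_2 - 0.6y_3 \ge -0.4, \qquad
    0.1y_1 - 0.5y_2 - 0.4y_3 \ge -0.5. \\[4pt]
\text{Multiply by } -1:\quad
  & \text{Maximize } -2.7y_1 + 6y_2 + 6y_3, \\
  & -0.3y_1 + 0.5y_2 + 0.6y_3 \le 0.4, \qquad
    -0.1y_1 + 0.5y_2 + 0.4y_3 \le 0.5.
\end{aligned}
```

The last line is the dual problem of Table 6.15, with $y_1 \ge 0$, $y_3 \ge 0$, and $y_2$ unconstrained in sign carried over unchanged.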

When the simplex method is applied to the nonstandard primal forms, and when
the artificial variable technique (perhaps supplemented by the Big M method) is used
to adapt to them, the duality interpretation of row 0 of the simplex tableau must be
adjusted somewhat. The reason is that the artificial variables and M’s revise the primal
problem, which thereby changes its dual problem, so the complementary basic solutions
shown in row 0 are for this revised dual problem. However, after the artificial
variables have been eliminated (made nonbasic) so that the current solution is a
legitimate basic feasible solution for the original primal problem, row 0 can still be
used to identify the complementary basic solution for the original dual problem. We
describe how to do this next.

Table 6.16 The Other Primal-Dual Form for the Radiation Therapy Example

Primal problem                              Dual problem
Maximize $(-Z) = -0.4x_1 - 0.5x_2$,         Minimize $y_0 = 2.7y_1 + 6y_2' - 6y_3$,
subject to                                  subject to
    $0.3x_1 + 0.1x_2 \le 2.7$                   $0.3y_1 + 0.5y_2' - 0.6y_3 \ge -0.4$
    $0.5x_1 + 0.5x_2 = 6$                       $0.1y_1 + 0.5y_2' - 0.4y_3 \ge -0.5$
    $-0.6x_1 - 0.4x_2 \le -6$
and                                         and
    $x_1 \ge 0$, $x_2 \ge 0$.                   $y_1 \ge 0$, $y_3 \ge 0$
                                                ($y_2'$ unconstrained in sign).


Suppose that we use the form in the first column of Table 6.14. For each equality
constraint $i$, its artificial variable plays the role of a slack variable, except that $M$ has
been added initially to the coefficient of this variable in row 0. Therefore, the current
value of the corresponding dual variable $y_i$ is the current coefficient of this artificial
variable minus $M$. If a $\le$ constraint has a negative right-hand side initially (perhaps
because it was converted from a $\ge$ constraint) so that it has been given an artificial
variable, the dual variable corresponding to this constraint still equals the coefficient
of its slack variable. The coefficient of the artificial variable would be ignored in this
case. Finally, if a variable $x_j$ is unconstrained in sign so that it has been replaced by
the difference of two nonnegative variables $(x_j^+ - x_j^-)$, then the coefficient of $x_j^+$,
denoted by $(z_j^+ - c_j)$, would be used just as for $x_j$. In other words,

$$z_j^+ = \sum_{i=1}^{m} a_{ij} y_i.$$

The coefficient of $x_j^-$, namely, $(z_j^- + c_j) = -(z_j^+ - c_j)$, would be ignored. Except
for these cases, the coefficients in row 0 would be used just as before (see Sec. 6.3)
to give the values of the corresponding dual variables.
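
These rules can be sketched as a small helper that reads a dual value from a row-0 coefficient of the form (multiple of $M$) $\cdot M$ + constant. The function name and the way coefficients are encoded are hypothetical. The sample coefficients ($0.5$ for the first slack variable, $M - 1.1$ for the artificial variable of the equality constraint, and $0$ for the slack variable of the third constraint) are the ones quoted in this section for the final tableau of the radiation therapy example; the last lines check the resulting values against the dual constraints of Table 6.16.

```python
from fractions import Fraction as F

def dual_value_from_row0(kind, m_multiple, constant):
    """Read a dual variable from a row-0 coefficient m_multiple*M + constant.

    kind is 'slack' (ordinary <= constraint) or 'artificial' (equality constraint,
    whose artificial variable had M added to its row-0 coefficient initially).
    """
    if kind == "artificial":
        m_multiple -= 1            # subtract the M that was added initially
    elif kind != "slack":
        raise ValueError("kind must be 'slack' or 'artificial'")
    if m_multiple != 0:
        raise ValueError("coefficient still depends on M")
    return constant

y1 = dual_value_from_row0("slack", 0, F(1, 2))            # coefficient 0.5
y2 = dual_value_from_row0("artificial", 1, F(-11, 10))    # coefficient M - 1.1
y3 = dual_value_from_row0("slack", 0, F(0))               # coefficient 0

# Left-hand sides of the two dual constraints in Table 6.16.
lhs1 = F(3, 10) * y1 + F(1, 2) * y2 - F(6, 10) * y3
lhs2 = F(1, 10) * y1 + F(1, 2) * y2 - F(4, 10) * y3
print(y1, y2, y3, lhs1, lhs2)
```

Both constraints hold with equality for these values, so both dual surplus variables are zero.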

To illustrate this procedure, we ask you to refer to the set of simplex tableaux
given in Table 4.12 for the radiation therapy example. The first three tableaux still
have artificial variables as basic variables. However, this is not the case for the final
tableau, so we can use its row 0 to identify the optimal solution for the dual problem
shown in Table 6.16. The first primal constraint is a standard one, so $y_1$ is just the
coefficient of the first slack variable ($x_3$), or $y_1 = 0.5$. The second constraint is
an equality constraint, so we refer to the coefficient of its artificial variable ($\bar{x}_4$) to
obtain

$$y_2' = (M - 1.1) - M = -1.1.$$

The third constraint has a negative right-hand side, so we use the coefficient of its
slack variable ($x_5$) to yield $y_3 = 0$. As before, the surplus variables for the dual
problem equal the coefficients of the original variables, $x_1$ and $x_2$, so $(z_1 - c_1) = 0$,
$(z_2 - c_2) = 0$. This completes the optimal dual basic solution,

$$(y_1, y_2', y_3, z_1 - c_1, z_2 - c_2) = (0.5, -1.1, 0, 0, 0).$$


6.5 The Role of Duality Theory in Sensitivity Analysis


As described further in the next two sections, sensitivity analysis basically involves
investigating the effect on the optimal solution of making changes in the values of
the model parameters (the $a_{ij}$, $b_i$, and $c_j$). However, changing parameter values in the
primal problem also changes the corresponding values in the dual problem. Therefore,
you have your choice of which problem to use to investigate each change. Because
of the primal-dual relationships presented in Secs. 6.1 and 6.3 (especially the
complementary basic solutions property), it is easy to move back and forth between the
two problems as desired. In some cases, it is more convenient to analyze the dual
problem directly in order to determine the complementary effect on the primal problem.
We begin by considering two such cases.


Changes in the Coefficients of a Nonbasic Variable 


Suppose that the changes made in the original model occur in the coefficients of a
variable that was nonbasic in the original optimal solution. What is the effect of these
changes on this solution? Is it still feasible? Is it still optimal?

Because the variable involved is nonbasic (value of zero), changing its coefficients
cannot affect the feasibility of the solution. Therefore, the open question in this
case is whether it is still optimal. As Tables 6.10 and 6.11 indicate, an equivalent
question is whether the complementary basic solution for the dual problem is still
feasible after making these changes. Since these changes affect the dual problem by
changing only one constraint, this question can be answered simply by checking
whether this complementary basic solution still satisfies this revised constraint.

We shall illustrate this case in the corresponding subsection of Sec. 6.7 after 
developing a relevant example. 


Introduction of a New Variable 


As indicated in Table 6.6, the decision variables in the model typically represent the 
level of the various activities under consideration. In some situations, these activities 
were selected from a larger group of possible activities, where the remaining activities 
were not included in the original model because they seemed less attractive. Or perhaps 
these other activities did not come to light until after the original model was formulated 
and solved. Either way, the key question is whether any of these previously uncon- 
sidered activities are sufficiently worthwhile to warrant initiation. In other words, 
would adding any of these activities to the model change the original optimal solution? 

Adding another activity amounts to introducing a new variable, with the ap- 
propriate coefficients in the functional constraints and objective function, into the 
model. The only resulting change in the dual problem is to add a new constraint 
(see Table 6.3). 

After these changes are made, would the original optimal solution, along with 
the new variable equal to zero (nonbasic), still be optimal for the primal problem? As 
for the preceding case, an equivalent question is whether the complementary basic 
solution for the dual problem is still feasible. And, as before, this question can be 
answered simply by checking whether this complementary basic solution satisfies one 
constraint, which in this case is the new constraint for the dual problem. 

To illustrate, suppose for the Wyndor Glass Co. problem of Sec. 3.1 that a
possible third new product now is being considered for inclusion in the product line.
Letting $x_{\text{new}}$ represent the production rate for this product, the resulting revised model
is shown as follows:

$$\text{Maximize} \quad Z = 3x_1 + 5x_2 + 4x_{\text{new}},$$

subject to

$$\begin{aligned}
x_1 + 2x_{\text{new}} &\le 4 \\
2x_2 + 3x_{\text{new}} &\le 12 \\
3x_1 + 2x_2 + x_{\text{new}} &\le 18
\end{aligned}$$

and

$$x_1 \ge 0, \quad x_2 \ge 0, \quad x_{\text{new}} \ge 0.$$

After introducing slack variables, the original optimal solution for this problem without
$x_{\text{new}}$ (see Table 4.8) was $(x_1, x_2, x_3, x_4, x_5) = (2, 6, 2, 0, 0)$. Is this solution,
along with $x_{\text{new}} = 0$, still optimal?

To answer this question, check the complementary basic solution for the dual
problem, which Table 6.9 (and Table 4.8) identifies as

$$(y_1, y_2, y_3, z_1 - c_1, z_2 - c_2) = (0, \tfrac{3}{2}, 1, 0, 0).$$

Since this solution was optimal for the original dual problem, it certainly satisfies the
original dual constraints shown in Table 6.1. But does it satisfy the one new dual
constraint,

$$2y_1 + 3y_2 + y_3 \ge 4?$$

Plugging in this solution,

$$2(0) + 3(\tfrac{3}{2}) + (1) \ge 4$$

is satisfied, so this dual solution is still feasible (and thus still optimal). Consequently,
the original primal solution (2, 6, 2, 0, 0), along with $x_{\text{new}} = 0$, is still optimal, so
this third possible new product should not be added to the product line.

This approach also makes it very easy to conduct sensitivity analysis on the 
coefficients of the new variable added to the primal problem. By simply checking the 
new dual constraint, you can immediately see how far any of these parameter values 
can be changed before they affect the feasibility of the dual solution and so the 
optimality of the primal solution. 
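
As a hedged illustration of this kind of check, the snippet below evaluates the new dual constraint for the Wyndor Glass Co. example and reports how large the unit profit of the new product could become before the constraint is violated. The variable names are hypothetical; the data are the coefficients of $x_{\text{new}}$ and the complementary dual solution $(0, \tfrac{3}{2}, 1)$ from the example above.

```python
from fractions import Fraction as F

y = [F(0), F(3, 2), F(1)]   # complementary dual solution (y1, y2, y3)
a_new = [2, 3, 1]           # resource usage per unit of the new product
c_new = 4                   # unit profit of the new product

# Left-hand side of the new dual constraint: implicit value of the resources
# that one unit of the new product would consume.
implicit_value = sum(a * yi for a, yi in zip(a_new, y))

still_optimal = implicit_value >= c_new   # new dual constraint satisfied?
break_even_profit = implicit_value        # largest c_new that keeps it satisfied

print(still_optimal, break_even_profit)
```

With these numbers the resources needed for one unit of the new product are worth $\tfrac{11}{2}$ at current shadow prices, so the unit profit would have to exceed that amount (with the other coefficients held fixed) before introducing the product could become worthwhile.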


Other Applications 


Already we have discussed two other key applications of duality theory to sensitivity
analysis, namely, shadow prices and the dual simplex method. As described in Secs.
4.7 and 6.2, the optimal dual solution $(y_1^*, y_2^*, \ldots, y_m^*)$ provides the shadow prices
for the respective resources that indicate how $Z$ would change if (small) changes were
made in the $b_i$ (the resource amounts). The resulting analysis will be illustrated in
some detail in Sec. 6.7.

In more general terms, the economic interpretation of the dual problem and of 
the simplex method presented in Sec. 6.2 provides some useful insights for sensitivity 
analysis. 

When we investigate the effect of changing the $b_i$ or the $a_{ij}$ (for basic variables),
the original optimal solution may become a superoptimal basic solution instead (see
Table 6.10). If you then want to reoptimize to identify the new optimal solution, the
dual simplex method (discussed at the end of Secs. 6.1 and 6.3) should be applied,
starting from this basic solution.

We mentioned in Sec. 6.1 that it sometimes is more efficient to solve the dual 
problem directly by the simplex method in order to identify an optimal solution for 
the primal problem. When the solution has been found in this way, sensitivity analysis 
for the primal problem then is conducted by applying the procedure described in the 
next two sections directly to the dual problem and then inferring the complementary 
effects on the primal problem (e.g., see Table 6.11). This approach to sensitivity 
analysis is relatively straightforward because of the close primal-dual relationships 
described in Secs. 6.1 and 6.3. (See Prob. 47.) 


6.6 The Essence of Sensitivity Analysis 


The work of the operations research team usually is not even nearly done when the
simplex method has been successfully applied to identify an optimal solution for the
model. As we pointed out at the end of Sec. 3.3, one assumption of linear programming
is that all the parameters of the model (the \(a_{ij}\), \(b_i\), and \(c_j\)) are known constants.
Actually, the parameter values used in the model normally are just estimates based
on a prediction of future conditions. The data obtained to develop these estimates
often are rather crude or nonexistent, so that the parameters in the original formulation
may represent little more than quick rules of thumb provided by harassed line personnel.
They may even represent deliberate overestimates or underestimates to protect
the interests of the estimators.

Thus the successful manager and operations research staff will maintain a healthy
skepticism about the original numbers coming out of the computer and will view them
in many cases as only a starting point for further analysis of the problem. An "optimal"
solution is optimal only with respect to the specific model being used to
represent the real problem, and such a solution becomes a reliable guide for action
only after it has been verified as performing well for other reasonable representations
of the problem as well. Furthermore, the model parameters (particularly the \(b_i\)) sometimes
are set as a result of managerial policy decisions (e.g., the amount of certain
resources to be made available to the activities), and these decisions should be
reviewed after seeing their potential consequences.

For these reasons it is important to perform sensitivity analysis to investigate 
the effect on the optimal solution provided by the simplex method if the parameters 
take on other possible values. Usually there will be some parameters that can be 
assigned any reasonable value without affecting the optimality of this solution. How- 
ever, there may also be parameters with likely alternative values that would yield a 
new optimal solution. This situation is particularly serious if the original solution 
would then have a substantially inferior value of the objective function, or perhaps 
even be infeasible! Therefore, the basic objective of sensitivity analysis is to identify 
these particularly sensitive parameters, so that special care can then be taken to esti- 
mate them more closely and to select a solution that performs well for most of their 
likely values. 

Sensitivity analysis would require an exorbitant computational effort if it were 
necessary to reapply the simplex method from the beginning to investigate each new 
change in a parameter value. Fortunately, the fundamental insight discussed in Sec.
5.3 virtually eliminates computational effort. The basic idea is that the fundamental
insight immediately reveals just how any changes in the original model would change
the numbers in the final simplex tableau (assuming that the same sequence of algebraic 
operations originally performed by the simplex method were to be duplicated). There- 
fore, after making a few simple calculations to revise this tableau, we can check easily 
whether the original optimal basic feasible solution is now nonoptimal (or infeasible). 
If so, this solution would be used as the initial basic solution to restart the simplex 
method (or dual simplex method) to find the new optimal solution, if desired. If the 
changes in the model are not major, only a very few iterations should be required to 
reach the new optimal solution from this "advanced" initial basic solution.

To describe this procedure more specifically, consider the following situation.
The simplex method already has been used to obtain an optimal solution to a linear
programming model with specified values for the \(b_i\), \(c_j\), and \(a_{ij}\) parameters. To initiate
sensitivity analysis, one or more of the parameters now is changed. After making the
changes, let \(\bar{b}_i\), \(\bar{c}_j\), and \(\bar{a}_{ij}\) denote the values of the various parameters. Thus, in matrix
notation,

\[ \mathbf{b} \to \bar{\mathbf{b}}, \qquad \mathbf{c} \to \bar{\mathbf{c}}, \qquad \mathbf{A} \to \bar{\mathbf{A}} \]

for the revised model.


The first step is to revise the final simplex tableau to reflect these changes.
Continuing to use the notation presented in Table 5.9, as well as the accompanying
formulas for the fundamental insight [(1) \(\mathbf{t}^* = \mathbf{t} + \mathbf{y}^*\mathbf{T}\) and (2) \(\mathbf{T}^* = \mathbf{S}^*\mathbf{T}\)], the
revised final tableau is calculated from \(\mathbf{y}^*\) and \(\mathbf{S}^*\) (which have not changed) and the
new initial tableau, as shown in Table 6.17.

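To illustrate, suppose that the original model for the Wyndor Glass Co. problem
of Sec. 3.1 is revised as shown below.

Original model:

\[ \text{Maximize } Z = [3, 5] \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \]

subject to

\[ \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \le \begin{bmatrix} 4 \\ 12 \\ 18 \end{bmatrix} \quad \text{and} \quad \mathbf{x} \ge \mathbf{0}. \]

Revised model:

\[ \text{Maximize } Z = [4, 5] \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \]

subject to

\[ \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \le \begin{bmatrix} 4 \\ 24 \\ 18 \end{bmatrix} \quad \text{and} \quad \mathbf{x} \ge \mathbf{0}. \]
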
Thus the changes from the original model are \(c_1 = 3 \to 4\), \(a_{31} = 3 \to 2\), and \(b_2 =
12 \to 24\). Figure 6.2 shows the graphical effect of these changes. For the original
model, the simplex method already has identified the optimal corner-point feasible
solution as (2, 6), lying at the intersection of the two constraint boundaries shown as
dashed lines, \(2x_2 = 12\) and \(3x_1 + 2x_2 = 18\). Now the revision of the model has
shifted both of these constraint boundaries as shown by the dark lines, \(2x_2 = 24\) and
\(2x_1 + 2x_2 = 18\). Consequently, the previous corner-point feasible solution (2, 6)
now shifts to the new intersection (-3, 12), which is a corner-point infeasible solution
for the revised model. The procedure described in the preceding paragraphs finds this
shift algebraically (in augmented form). Furthermore, it does so in a manner that is
very efficient even for huge problems where graphical analysis is impossible.

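To carry out this procedure, we begin by displaying the parameters of the revised
model in matrix form:

\[ \bar{\mathbf{c}} = [4, 5], \qquad \bar{\mathbf{A}} = \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 2 & 2 \end{bmatrix}, \qquad \bar{\mathbf{b}} = \begin{bmatrix} 4 \\ 24 \\ 18 \end{bmatrix}. \]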


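Table 6.17 Revised Final Simplex Tableau Resulting from Changes in Original Model

\[
\begin{array}{l|c|c|c|c|c}
 & \text{Eq. No.} & Z & \text{Original variables} & \text{Slack variables} & \text{Right side} \\ \hline
\text{New initial tableau} & 0 & 1 & -\bar{\mathbf{c}} & \mathbf{0} & 0 \\
 & (1, 2, \ldots, m) & \mathbf{0} & \bar{\mathbf{A}} & \mathbf{I} & \bar{\mathbf{b}} \\ \hline
\text{Revised final tableau} & 0 & 1 & \mathbf{z}^* - \bar{\mathbf{c}} = \mathbf{y}^*\bar{\mathbf{A}} - \bar{\mathbf{c}} & \mathbf{y}^* & Z^* = \mathbf{y}^*\bar{\mathbf{b}} \\
 & (1, 2, \ldots, m) & \mathbf{0} & \mathbf{A}^* = \mathbf{S}^*\bar{\mathbf{A}} & \mathbf{S}^* & \mathbf{b}^* = \mathbf{S}^*\bar{\mathbf{b}}
\end{array}
\]


Figure 6.2 Shift of the final corner-point solution from (2, 6) to (-3, 12) for the revision of the
Wyndor Glass Co. problem where \(c_1 = 3 \to 4\), \(a_{31} = 3 \to 2\), and \(b_2 = 12 \to 24\).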


The resulting new initial simplex tableau is shown at the top of Table 6.18. Below 
this tableau is the original final tableau (as first given in Table 4.8). We have drawn 
dark boxes around the portions of this final tableau that the changes in the model 
definitely do not change, namely, the coefficients of the slack variables in both row 
0 (y*) and the rest of the rows (S*). Thus, 


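\[ \mathbf{y}^* = [0, \tfrac{3}{2}, 1], \qquad \mathbf{S}^* = \begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix}. \]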


These coefficients of the slack variables necessarily are unchanged with the same 
algebraic operations originally performed by the simplex method because the coeffi- 
cients of these same variables in the initial tableau are unchanged. 

However, because other portions of the initial tableau have changed, there will 
be changes in the rest of the final tableau as well. Using the formulas in Table 6.17, 


the revised numbers in the rest of the final tableau are calculated as follows: 


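\[ \mathbf{z}^* - \bar{\mathbf{c}} = [0, \tfrac{3}{2}, 1] \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 2 & 2 \end{bmatrix} - [4, 5] = [-2, 0], \qquad Z^* = [0, \tfrac{3}{2}, 1] \begin{bmatrix} 4 \\ 24 \\ 18 \end{bmatrix} = 54, \]

\[ \mathbf{A}^* = \begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{3} & 0 \\ 0 & 1 \\ \tfrac{2}{3} & 0 \end{bmatrix}, \]

\[ \mathbf{b}^* = \begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix} \begin{bmatrix} 4 \\ 24 \\ 18 \end{bmatrix} = \begin{bmatrix} 6 \\ 12 \\ -2 \end{bmatrix}. \]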


The resulting revised final tableau is shown at the bottom of Table 6.18. 
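
The formulas of Table 6.17 can be checked numerically. The following is a minimal
sketch, assuming the values of y* and S* from the original Wyndor final tableau; the
helper names vec_mat and mat_vec are illustrative, and exact fractions are used to
avoid rounding.

```python
from fractions import Fraction as F

def vec_mat(v, M):
    """Row vector times matrix."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

def mat_vec(M, v):
    """Matrix times column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Slack-variable portions of the original final tableau (unchanged by the revision).
y_star = [F(0), F(3, 2), F(1)]
S_star = [[F(1), F(1, 3), F(-1, 3)],
          [F(0), F(1, 2), F(0)],
          [F(0), F(-1, 3), F(1, 3)]]

# Parameters of the revised model.
c_bar = [F(4), F(5)]
A_bar = [[F(1), F(0)], [F(0), F(2)], [F(2), F(2)]]
b_bar = [F(4), F(24), F(18)]

row0 = [z - c for z, c in zip(vec_mat(y_star, A_bar), c_bar)]  # z* - c_bar
Z_star = sum(y * b for y, b in zip(y_star, b_bar))             # y* b_bar
A_star = [vec_mat(row, A_bar) for row in S_star]               # S* A_bar
b_star = mat_vec(S_star, b_bar)                                # S* b_bar

print([str(v) for v in row0], Z_star, [str(v) for v in b_star])
```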

Actually, we can substantially streamline these calculations for obtaining the
revised final tableau. Because none of the coefficients of \(x_2\) changed in the original
model (tableau), none of them can change in the final tableau, so we can delete their
calculation. Several other original parameters (\(a_{11}\), \(a_{21}\), \(b_1\), \(b_3\)) also were not changed,
so another shortcut is to calculate only the incremental changes in the final tableau in
terms of the incremental changes in the initial tableau, ignoring those terms in the
vector or matrix multiplication that involve zero change in the initial tableau. In
particular, the only incremental changes in the initial tableau are \(\Delta c_1 = 1\), \(\Delta a_{31} =
-1\), and \(\Delta b_2 = 12\), so these are the only terms that need be considered. This streamlined
approach is shown below, where a zero or dash appears in each spot where no
calculation is needed.





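\[ \Delta(\mathbf{z}^* - \mathbf{c}) = \mathbf{y}^* \Delta\mathbf{A} - \Delta\mathbf{c} = [0, \tfrac{3}{2}, 1] \begin{bmatrix} 0 & \text{--} \\ 0 & \text{--} \\ -1 & \text{--} \end{bmatrix} - [1, \text{--}] = [-2, \text{--}], \]

\[ \Delta Z^* = \mathbf{y}^* \Delta\mathbf{b} = [0, \tfrac{3}{2}, 1] \begin{bmatrix} 0 \\ 12 \\ 0 \end{bmatrix} = 18, \]

\[ \Delta\mathbf{A}^* = \mathbf{S}^* \Delta\mathbf{A} = \begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix} \begin{bmatrix} 0 & \text{--} \\ 0 & \text{--} \\ -1 & \text{--} \end{bmatrix} = \begin{bmatrix} \tfrac{1}{3} & \text{--} \\ 0 & \text{--} \\ -\tfrac{1}{3} & \text{--} \end{bmatrix}, \]

\[ \Delta\mathbf{b}^* = \mathbf{S}^* \Delta\mathbf{b} = \begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix} \begin{bmatrix} 0 \\ 12 \\ 0 \end{bmatrix} = \begin{bmatrix} 4 \\ 6 \\ -4 \end{bmatrix}. \]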


Adding these increments to the original quantities in the final tableau (middle of Table 
6.18) then yields the revised final tableau (bottom of Table 6.18). 
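
The incremental version of the calculation can be sketched the same way. This is an
illustrative check only; the variable names are assumptions, and the "original" entries
are those of the final tableau for the original Wyndor model.

```python
from fractions import Fraction as F

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

y_star = [F(0), F(3, 2), F(1)]
S_star = [[F(1), F(1, 3), F(-1, 3)],
          [F(0), F(1, 2), F(0)],
          [F(0), F(-1, 3), F(1, 3)]]

# The only nonzero changes in the initial tableau.
delta_c1 = F(1)                     # c1: 3 -> 4
delta_col1 = [F(0), F(0), F(-1)]    # a31: 3 -> 2
delta_b = [F(0), F(12), F(0)]       # b2: 12 -> 24

# Incremental changes in the final tableau.
d_row0_x1 = dot(y_star, delta_col1) - delta_c1
d_Z = dot(y_star, delta_b)
d_col1 = [dot(row, delta_col1) for row in S_star]
d_rhs = [dot(row, delta_b) for row in S_star]

# Add the increments to the original final tableau entries.
orig_col1 = [F(0), F(0), F(1)]      # x1 column, rows 1 to 3
orig_rhs = [F(2), F(6), F(2)]       # right side, rows 1 to 3
revised_row0_x1 = F(0) + d_row0_x1
revised_Z = F(36) + d_Z
revised_col1 = [o + d for o, d in zip(orig_col1, d_col1)]
revised_rhs = [o + d for o, d in zip(orig_rhs, d_rhs)]
```

Because each increment is a product of fixed entries of y* or S* with the change in the
initial tableau, doubling a change doubles the corresponding increments.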

This incremental analysis also provides a useful general insight, namely, that 
changes in the final tableau must be proportional to each change in the initial tableau. 
We illustrate in the next section how this property enables us to use linear interpolation 
or extrapolation to determine the range of values for a given parameter over which 
the final basic solution remains both feasible and optimal. 

After obtaining the revised final simplex tableau, we next convert the tableau
to proper form from Gaussian elimination (as needed). In particular, the basic variable
for row i must have a coefficient of 1 in that row and a coefficient of zero in every
other row (including row 0) for the tableau to be in the proper form for identifying
and evaluating the current basic solution. Therefore, if the changes have violated this
requirement (which can occur only if the original constraint coefficients of a basic
variable have been changed), further changes must be made to restore this form. This
restoration is done by using Gaussian elimination, i.e., by successively applying part
3 of the iterative step for the simplex method (see Chap. 4) as if each violating basic
variable were an entering basic variable. Note that these algebraic operations may also
cause further changes in the right-side column, so that the current basic solution can
be read from this column only when proper form from Gaussian elimination has been
fully restored.


Table 6.18 Obtaining the Revised Final Simplex Tableau for the Revised Wyndor
Glass Co. Problem

                   Basic      Eq.          Coefficient of                      Right
                   Variable   No.    Z     x1     x2     x3     x4     x5      Side
New initial        Z          0      1     -4     -5     0      0      0       0
tableau            x3         1      0     1      0      1      0      0       4
                   x4         2      0     0      2      0      1      0       24
                   x5         3      0     2      2      0      0      1       18

Final tableau      Z          0      1     0      0      0      3/2    1       36
for original       x3         1      0     0      0      1      1/3    -1/3    2
model              x2         2      0     0      1      0      1/2    0       6
                   x1         3      0     1      0      0      -1/3   1/3     2

Revised final      Z          0      1     -2     0      0      3/2    1       54
tableau            x3         1      0     1/3    0      1      1/3    -1/3    6
                   x2         2      0     0      1      0      1/2    0       12
                   x1         3      0     2/3    0      0      -1/3   1/3     -2

For the example, the revised final simplex tableau shown in the top half of Table
6.19 is not in proper form from Gaussian elimination because of the column for the
basic variable \(x_1\). Specifically, the coefficient of \(x_1\) in its row (row 3) is \(\tfrac{2}{3}\) instead of
1, and it has nonzero coefficients (-2 and \(\tfrac{1}{3}\)) in rows 0 and 1. To restore proper
form, row 3 is multiplied by \(\tfrac{3}{2}\); then, 2 times this new row 3 is added to row 0 and
\(\tfrac{1}{3}\) of the new row 3 is subtracted from row 1. This yields the proper form from Gaussian
elimination shown in the bottom half of Table 6.19, which now can be used to identify
the new values for the current (previously optimal) basic solution,

\[ (x_1, x_2, x_3, x_4, x_5) = (-3, 12, 7, 0, 0). \]

Because \(x_1\) is negative, this basic solution no longer is feasible. However, it is
superoptimal (see Table 6.10) because all the coefficients in row 0 still are nonnegative.
Therefore, the dual simplex method can be used to reoptimize (if desired),
starting from this basic solution. Referring to Fig. 6.2 (and ignoring slack variables),
the dual simplex method uses just one iteration to move from the corner-point solution
(-3, 12) to the optimal corner-point feasible solution (0, 9). (It is often useful in
sensitivity analysis to identify the solutions that are optimal for some set of likely
values of the model parameters and then to determine which of these solutions most
consistently performs well for the various likely parameter values.)
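
The row operations that restore proper form amount to a single pivot on the entry in
the basic variable's own row. A minimal sketch follows; the function name
restore_proper_form is hypothetical, and the tableau rows hold the coefficients of
x1 through x5 followed by the right side.

```python
from fractions import Fraction as F

# Revised final tableau for the revised Wyndor model (columns x1..x5, right side).
tableau = [
    [F(-2),   F(0), F(0), F(3, 2),  F(1),     F(54)],  # row 0
    [F(1, 3), F(0), F(1), F(1, 3),  F(-1, 3), F(6)],   # row 1, basic variable x3
    [F(0),    F(1), F(0), F(1, 2),  F(0),     F(12)],  # row 2, basic variable x2
    [F(2, 3), F(0), F(0), F(-1, 3), F(1, 3),  F(-2)],  # row 3, basic variable x1
]

def restore_proper_form(T, row, col):
    """Pivot on T[row][col]: scale that row to a unit coefficient, then
    eliminate the column from every other row (including row 0)."""
    T = [r[:] for r in T]
    pivot = T[row][col]  # a zero here is the pitfall where this shortcut fails
    T[row] = [v / pivot for v in T[row]]
    for i in range(len(T)):
        if i != row and T[i][col] != 0:
            factor = T[i][col]
            T[i] = [v - factor * p for v, p in zip(T[i], T[row])]
    return T

proper = restore_proper_form(tableau, row=3, col=0)
right_side = [r[-1] for r in proper]   # Z, x3, x2, x1
```

Reading the right side gives Z = 48 and (x3, x2, x1) = (7, 12, -3), in agreement with
the basic solution (-3, 12, 7, 0, 0) stated above.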


Table 6.19 Converting the Revised Final Simplex Tableau to Proper Form from Gaussian
Elimination for the Revised Wyndor Glass Co. Problem

                   Basic      Eq.          Coefficient of                      Right
                   Variable   No.    Z     x1     x2     x3     x4     x5      Side
Revised            Z          0      1     -2     0      0      3/2    1       54
final              x3         1      0     1/3    0      1      1/3    -1/3    6
tableau            x2         2      0     0      1      0      1/2    0       12
                   x1         3      0     2/3    0      0      -1/3   1/3     -2

Converted          Z          0      1     0      0      0      1/2    2       48
to proper          x3         1      0     0      0      1      1/2    -1/2    7
form               x2         2      0     0      1      0      1/2    0       12
                   x1         3      0     1      0      0      -1/2   1/2     -3


If the basic solution (-3, 12, 7, 0, 0) had been neither feasible nor superoptimal
(i.e., if the tableau had negative entries in both the right-side column and row 0),
artificial variables could have been introduced to convert the tableau to the proper
form for an initial simplex tableau.¹

When testing to see how sensitive the original optimal solution is to the various 
parameters of the model, the common approach is to check each parameter individ- 
ually, changing its value from the initial estimate to other possibilities in the range 
of likely values (including the endpoints of this range). After the sensitive parameters 
have been identified, then some combinations of simultaneous changes of these pa- 
rameters may be investigated. Each time one (or more) of the parameters is changed, 
the procedure described and illustrated here would be applied. Let us now summarize 
this procedure. 


Summary of Procedure for Sensitivity Analysis 


1. Revision of model: Make the desired change or changes in the model to be 
investigated next. 

2. Revision of final tableau: Use the fundamental insight to determine the re- 
sulting changes in the final simplex tableau. 

3. Conversion to proper form from Gaussian elimination: Convert this tableau 
to the proper form for identifying and evaluating the current basic solution 
by applying (as necessary) Gaussian elimination. 

4. Feasibility test: Test this solution for feasibility by checking whether all its 
basic variable values in the right-side column of the tableau still are non- 
negative. 

5. Optimality test: Test this solution for optimality (if feasible) by checking 
whether all its nonbasic variable coefficients in row 0 of the tableau still are 
nonnegative. 

6. Reoptimization: If this solution fails either test, the new optimal solution can 
be obtained (if desired) by using the current tableau as the initial simplex 
tableau (making any necessary conversions) for the simplex method or dual 
simplex method. 
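
Steps 4 through 6 of this summary can be sketched as a small classification routine.
This is an illustration under stated assumptions, not a full implementation: the
function name classify and its return strings are hypothetical, and the tableau is
assumed to be already in proper form from Gaussian elimination, with row 0 first and
the right side in the last column.

```python
from fractions import Fraction as F

def classify(tableau):
    """Apply the feasibility and optimality tests to a tableau in proper form
    and report which method (if any) would be used to reoptimize."""
    row0 = tableau[0][:-1]                        # coefficients in row 0
    right_side = [row[-1] for row in tableau[1:]]  # values of the basic variables
    feasible = all(v >= 0 for v in right_side)
    row0_ok = all(v >= 0 for v in row0)           # basic variables have 0 here
    if feasible and row0_ok:
        return "optimal"
    if feasible:
        return "feasible, not optimal: use simplex method"
    if row0_ok:
        return "superoptimal: use dual simplex method"
    return "neither: convert to an initial tableau first"

# Proper-form tableau for the revised Wyndor model (columns x1..x5, right side).
example = [
    [F(0), F(0), F(0), F(1, 2),  F(2),     F(48)],
    [F(0), F(0), F(1), F(1, 2),  F(-1, 2), F(7)],
    [F(0), F(1), F(0), F(1, 2),  F(0),     F(12)],
    [F(1), F(0), F(0), F(-1, 2), F(1, 2),  F(-3)],
]
result = classify(example)
```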


¹ There also exists a primal-dual algorithm that can be directly applied to such a simplex tableau without
any conversion.

In the next section, we shall discuss and illustrate the application of this procedure
to each of the major categories of revisions in the original model. This discussion
will involve, in part, expanding upon the example introduced in this section
for investigating changes in the Wyndor Glass Co. model. In fact, we shall begin by
individually checking each of the preceding changes. At the same time, we shall
integrate some of the applications of duality theory to sensitivity analysis discussed
in Sec. 6.5.


6.7 Applying Sensitivity Analysis 


Sensitivity analysis often begins with the investigation of the effect of changes in the
\(b_i\), the amount of resource i (i = 1, 2, . . . , m) being made available for the activities
under consideration. The reason is that there generally is more flexibility in setting
and adjusting these values than there is for the other parameters of the model. As
already discussed in Secs. 4.7 and 6.2, the economic interpretation of the dual variables
(the \(y_i\)) as shadow prices is extremely useful for deciding which changes should
be considered.


Case 1—Changes in the \(b_i\)


Suppose that the only changes in the current model are that one or more of the \(b_i\)
parameters (i = 1, 2, . . . , m) has been changed. In this case, the only resulting
changes in the final simplex tableau are in the right-side column. Therefore, both the
conversion to proper form from Gaussian elimination and the optimality test steps of
the general procedure can be skipped.


EXAMPLE: Sensitivity analysis is begun for the original Wyndor Glass Co. problem
of Sec. 3.1 by examining the optimal values of the \(y_i\) dual variables (\(y_1^* = 0\), \(y_2^* =
\tfrac{3}{2}\), \(y_3^* = 1\)). These shadow prices give the marginal value of each resource i for the
activities (two new products) under consideration. As discussed in Sec. 4.7 (see Fig.
4.3), the total profit from these activities can be increased $1.50/minute for each
additional unit of resource 2 (production capacity in Plant 2) that is made available.
This increase in profit holds for relatively small changes that do not affect the feasibility
of the current basic solution (and so do not affect the values of the \(y_i^*\)).
Consequently, the OR Department has investigated the marginal profitability
from the other current uses of this resource to determine if any are less than
$1.50/minute. This investigation reveals that one old product is far less profitable.
The production rate for this product already has been reduced to the minimum amount
that would justify its marketing expenses. However, it can be discontinued altogether,
which would provide an additional 12 units of resource 2 for the new products. Thus
the next step is to determine what profit could be obtained from the new products if
this shift were to be made. This shift changes \(b_2\) from 12 to 24 in the linear programming
model. Figure 6.3 shows the graphical effect of this change, including the shift
in the final corner-point solution from (2, 6) to (-2, 12). (Note that this figure differs
from Fig. 6.2 because the constraint, \(3x_1 + 2x_2 \le 18\), has not been changed here.)
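Figure 6.3 Feasible region for the Wyndor Glass Co. problem after changing just \(b_2\) to 24.

When the fundamental insight (Table 6.17) is applied, the effect of this change
on the original final simplex tableau (middle of Table 6.18) is found to be that the
entries in the right-side column change to the following values:

\[ Z^* = \mathbf{y}^* \bar{\mathbf{b}} = [0, \tfrac{3}{2}, 1] \begin{bmatrix} 4 \\ 24 \\ 18 \end{bmatrix} = 54, \]

\[ \mathbf{b}^* = \mathbf{S}^* \bar{\mathbf{b}} = \begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix} \begin{bmatrix} 4 \\ 24 \\ 18 \end{bmatrix} = \begin{bmatrix} 6 \\ 12 \\ -2 \end{bmatrix}, \quad \text{so} \quad \begin{bmatrix} x_3 \\ x_2 \\ x_1 \end{bmatrix} = \begin{bmatrix} 6 \\ 12 \\ -2 \end{bmatrix}. \]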


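Equivalently, because the only change in the original model is \(\Delta b_2 = 24 - 12 =
12\), incremental analysis can be used to calculate these same values more quickly as
follows:

\[ \Delta Z^* = \tfrac{3}{2}(12) = 18, \quad \text{so } Z^* = 36 + 18 = 54, \]
\[ \Delta b_1^* = \tfrac{1}{3}(12) = 4, \quad \text{so } b_1^* = 2 + 4 = 6, \]
\[ \Delta b_2^* = \tfrac{1}{2}(12) = 6, \quad \text{so } b_2^* = 6 + 6 = 12, \]
\[ \Delta b_3^* = -\tfrac{1}{3}(12) = -4, \quad \text{so } b_3^* = 2 - 4 = -2, \]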


where the original values of these quantities are obtained from the right-side column
in the original final tableau (middle of Table 6.18). The resulting revised final tableau
corresponds completely to this original final tableau except for replacing the right-side
column with these new values.
Therefore, the current (previously optimal) basic solution has become

\[ (x_1, x_2, x_3, x_4, x_5) = (-2, 12, 6, 0, 0), \]

which fails the feasibility test because of the negative value. The dual simplex method
now can be applied, starting with this revised simplex tableau, to find the new optimal
solution. This method leads in just one iteration to the new final simplex tableau
shown in Table 6.20. (Alternatively, the simplex method could be applied from the
beginning, which also would lead to this final tableau in just one iteration in this
case.) This tableau indicates that the new optimal solution is

\[ (x_1, x_2, x_3, x_4, x_5) = (0, 9, 4, 6, 0), \]

with Z = 45, thereby providing an increase in profit from the new products of
$9/minute over the previous Z = 36. The fact that \(x_4 = 6\) indicates that 6 of the 12
additional units of resource 2 are unused by this solution.
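
The single dual simplex iteration can be traced with a short sketch. The function
dual_simplex_step below is an illustrative helper, not a library routine: it picks the
row with the most negative right side as the leaving row and applies the usual
minimum-ratio rule over the negative entries of that row to choose the entering column.

```python
from fractions import Fraction as F

def dual_simplex_step(T, basis):
    """One dual simplex iteration. T has row 0 first and the right side last;
    basis[i] is the column index of the basic variable for row i + 1."""
    rhs = [row[-1] for row in T[1:]]
    r = 1 + rhs.index(min(rhs))                      # leaving row
    if T[r][-1] >= 0:
        return [row[:] for row in T], basis[:]       # already feasible
    cols = [j for j in range(len(T[0]) - 1) if T[r][j] < 0]
    if not cols:
        raise ValueError("no feasible solution exists")
    k = min(cols, key=lambda j: T[0][j] / -T[r][j])  # entering column
    new = [row[:] for row in T]
    new[r] = [v / T[r][k] for v in T[r]]
    for i in range(len(T)):
        if i != r:
            new[i] = [v - T[i][k] * p for v, p in zip(T[i], new[r])]
    new_basis = basis[:]
    new_basis[r - 1] = k
    return new, new_basis

# Original final tableau with the right side revised for b2 = 24 (columns x1..x5).
T = [
    [F(0), F(0), F(0), F(3, 2),  F(1),     F(54)],
    [F(0), F(0), F(1), F(1, 3),  F(-1, 3), F(6)],   # x3
    [F(0), F(1), F(0), F(1, 2),  F(0),     F(12)],  # x2
    [F(1), F(0), F(0), F(-1, 3), F(1, 3),  F(-2)],  # x1
]
T_new, basis_new = dual_simplex_step(T, [2, 1, 0])

x = [F(0)] * 5
for i, j in enumerate(basis_new):
    x[j] = T_new[i + 1][-1]
Z = T_new[0][-1]
```

Here x1 leaves the basis and x4 enters, which reproduces the solution (0, 9, 4, 6, 0)
with Z = 45.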

Table 6.20 Revised Data for Wyndor Glass Co. Problem after Changing Just \(b_2\)

Model parameters

    c1 = 3,  c2 = 5  (n = 2)
    a11 = 1, a12 = 0, b1 = 4
    a21 = 0, a22 = 2, b2 = 24
    a31 = 3, a32 = 2, b3 = 18

Final Simplex Tableau after Reoptimization

    Basic      Eq.          Coefficient of                      Right
    Variable   No.    Z     x1     x2     x3     x4     x5      Side
    Z          0      1     9/2    0      0      0      5/2     45
    x3         1      0     1      0      1      0      0       4
    x2         2      0     3/2    1      0      0      1/2     9
    x4         3      0     -3     0      0      1      -1      6


Although \(\Delta b_2 = 12\) proved to be too large an increase in \(b_2\) to retain feasibility
(and so optimality) with the basic solution where \(x_1\), \(x_2\), and \(x_3\) are the basic variables
(middle of Table 6.18), the above incremental analysis shows immediately just how
large an increase is feasible. In particular, note that

\[ b_1^* = 2 + \tfrac{1}{3}\Delta b_2, \]
\[ b_2^* = 6 + \tfrac{1}{2}\Delta b_2, \]
\[ b_3^* = 2 - \tfrac{1}{3}\Delta b_2, \]

where these three quantities are the values of \(x_3\), \(x_2\), and \(x_1\), respectively, for this
basic solution. The solution remains feasible, and so optimal, as long as all three
quantities remain nonnegative. To determine the range of values of \(b_2\) over which all
three quantities remain nonnegative, set each quantity to zero, solve for \(\Delta b_2\), and
choose the positive and negative values of \(\Delta b_2\) closest to zero, namely,

\[ -6 \le \Delta b_2 \le 6, \quad \text{or} \quad 6 \le b_2 \le 18. \]

Thus, \(6 \le b_2 \le 18\) is the allowable range for \(b_2\) over which the original final basic
solution (with new values for the basic variables) remains feasible, and so optimal,
as long as this is the only change in the original model. (Many linear programming
software packages use this technique for automatically generating the allowable range
for each \(b_i\), and a similar technique for each \(c_j\).)
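
The range calculation for a single right-hand side can be sketched as follows. The
function allowable_delta is a hypothetical helper: it takes the current right-side
values of the basic variables and the rates at which they change per unit change in
the resource (the corresponding column of S*), and returns the interval of changes
that keeps every value nonnegative. It assumes the current values are nonnegative.

```python
from fractions import Fraction as F

def allowable_delta(rhs, rate):
    """Return (lo, hi) such that rhs[i] + rate[i] * delta >= 0 for every i
    whenever lo <= delta <= hi. None means unbounded on that side."""
    lo, hi = None, None
    for value, r in zip(rhs, rate):
        if r > 0:                       # gives a lower bound on delta
            bound = -value / r
            lo = bound if lo is None else max(lo, bound)
        elif r < 0:                     # gives an upper bound on delta
            bound = -value / r
            hi = bound if hi is None else min(hi, bound)
    return lo, hi

# Original Wyndor final tableau: values of x3, x2, x1 and the S* column for resource 2.
rhs = [F(2), F(6), F(2)]
rate = [F(1, 3), F(1, 2), F(-1, 3)]

lo, hi = allowable_delta(rhs, rate)
b2_range = (12 + lo, 12 + hi)
```

With these inputs the change in b2 is confined to the interval from -6 to 6, so b2 may
range from 6 to 18.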

Based on the results with \(b_2 = 24\), the relatively unprofitable old product will
be discontinued and the unused 6 units of resource 2 will be saved for some future
use. Since \(y_3^*\) still is positive, a similar study is made of the possibility of changing
the allocation of resource 3, but the resulting decision is to retain the current allocation.
Therefore, the current linear programming model at this point has the parameter values
and optimal solution shown in Table 6.20.


Case 2a—Changes in the Coefficients of a Nonbasic Variable 


Consider a particular variable \(x_j\) (fixed j) that is a nonbasic variable in the optimal
solution shown by the final simplex tableau (so that \(x_j\) is not included in the list of
basic variables in the first column of this tableau). Case 2a is where the only changes
in the current model are that one or more of the coefficients of this variable (\(c_j\), \(a_{1j}\),
\(a_{2j}\), . . . , \(a_{mj}\)) have been changed.

As described at the beginning of Sec. 6.5, duality theory provides a very convenient
way of checking these changes. In particular, if the complementary basic
solution \(\mathbf{y}^*\) in the dual problem still satisfies the single dual constraint that has
changed, then the original optimal solution in the primal problem remains optimal as
is. Conversely, if \(\mathbf{y}^*\) violates this dual constraint, then this primal solution is no longer
optimal.

If the optimal solution has changed and you wish to find the new one, you can
do so rather easily. Simply apply the fundamental insight to revise the \(x_j\) column (the
only one that has changed) in the final simplex tableau. With the current basic solution
no longer optimal, the new value of \((z_j^* - c_j)\) now will be the one negative coefficient
in row 0, so restart the simplex method with \(x_j\) as the initial entering basic variable.

Note that this procedure is a streamlined version of the general procedure summarized
at the end of Sec. 6.6. Steps 3 and 4 (conversion to proper form from
Gaussian elimination and feasibility test) have been deleted as irrelevant, because the
only column being changed in the revision of the final tableau (before reoptimization)
is for the nonbasic variable \(x_j\). Step 5 (optimality test) has been replaced by a quicker
test of optimality to be performed right after step 1 (revision of model). It is only if
this test reveals that the optimal solution has changed, and you wish to find the new
one, that steps 2 and 6 (revision of final tableau and reoptimization) are needed.


EXAMPLE: Since \(x_1\) is nonbasic in the current optimal solution (see Table 6.20) for
the Wyndor Glass Co. problem, the next step in its sensitivity analysis is to check
whether any reasonable changes in the estimates of the coefficients of \(x_1\) could still
make it advisable to introduce product 1. The set of changes that goes as far as
realistically possible to make product 1 more attractive would be to reset \(c_1 = 4\) and
\(a_{31} = 2\) (as was done in Sec. 6.6).

This change in \(a_{31}\) revises the feasible region from that shown in Fig. 6.3
to the corresponding region in Fig. 6.2 when \(3x_1 + 2x_2 = 18\) is replaced by \(2x_1 +
2x_2 = 18\). (Ignore the \(2x_2 = 12\) line, because the \(2x_2 \le 12\) constraint already has
been replaced by \(2x_2 \le 24\).) The change in \(c_1\) revises the objective function from
\(Z = 3x_1 + 5x_2\) to \(Z = 4x_1 + 5x_2\). By using Fig. 6.2 to draw the objective function
line, \(Z = 45 = 4x_1 + 5x_2\), through the current optimal solution (0, 9), you can
verify that (0, 9) remains optimal after these changes in \(a_{31}\) and \(c_1\).

To use duality theory to draw this same conclusion, observe that the changes
in \(c_1\) and \(a_{31}\) lead to a single revised constraint for the dual problem (see Table 6.1).
Both this revised constraint and the current \(\mathbf{y}^*\) (coefficients of the slack variables in
row 0 of Table 6.20) are shown below.

\[ y_1 + 3y_3 \ge 3 \quad \longrightarrow \quad y_1 + 2y_3 \ge 4, \]
\[ y_1^* = 0, \qquad y_2^* = 0, \qquad y_3^* = \tfrac{5}{2}. \]

Note that \(\mathbf{y}^*\) still satisfies the revised constraint, so the current primal solution (Table
6.20) is still optimal.

This approach of examining the revised dual constraint makes it easy to see just
how much the parameters involved can be changed before the current optimal solution
would become nonoptimal. For example, with \(a_{31} = 2\), so that \(y_1^* + 2y_3^* = 5\), the
allowable range for \(c_1\) without changing the optimal solution is \(c_1 \le 5\). Similarly,
with the original value of \(a_{31}\) (\(a_{31} = 3\)), so that \(y_1^* + 3y_3^* = \tfrac{15}{2}\), the allowable range
is \(c_1 \le \tfrac{15}{2}\), so \(c_1\) can be increased by as much as \(\tfrac{9}{2}\) (\(\Delta c_1 \le \tfrac{9}{2}\)) above its original value
of \(c_1 = 3\). This latter allowable range also can be obtained directly from Table 6.20,
which gives \(\tfrac{9}{2}\) as the coefficient of \(x_1\) in row 0 of the final tableau. When the only
parameter change from Table 6.20 is an increase in \(c_1\), Table 6.17 indicates that the
only resulting change in the final tableau is that this coefficient becomes \(\tfrac{9}{2} - \Delta c_1\), so
\(\Delta c_1 \le \tfrac{9}{2}\) is the allowable change permitted by the optimality test, and \(c_1 \le 3 + \tfrac{9}{2} =
7\tfrac{1}{2}\) is the allowable range for \(c_1\). (This is the method used by many linear programming
software packages to obtain the allowable range for the \(c_j\) corresponding to nonbasic
variables.)
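
These allowable ranges follow from evaluating the left side of the dual constraint at
y*. A brief sketch, with the hypothetical helper name cost_ceiling and the current
y* = (0, 0, 5/2) taken from the example:

```python
from fractions import Fraction as F

# Current complementary dual solution after changing b2 to 24.
y_star = [F(0), F(0), F(5, 2)]

def cost_ceiling(column, y):
    """Largest objective coefficient for a nonbasic variable with the given
    constraint column that still satisfies its dual constraint at y."""
    return sum(a * yi for a, yi in zip(column, y))

ceiling_revised = cost_ceiling([1, 0, 2], y_star)    # column of x1 with a31 = 2
ceiling_original = cost_ceiling([1, 0, 3], y_star)   # column of x1 with a31 = 3

revised_still_optimal = F(4) <= ceiling_revised      # revised c1 = 4
max_increase_in_c1 = ceiling_original - 3            # above the original c1 = 3
```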

Because any larger changes in the original estimates of the coefficients of \(x_1\)
would be unrealistic, the OR Department concludes that these coefficients are insensitive
parameters in the current model. Therefore, they will be kept fixed at their best
estimates shown in Table 6.20, \(c_1 = 3\) and \(a_{31} = 3\), for the remainder of the sensitivity
analysis.


Case 2b—Introduction of a New Variable 


After solving for the optimal solution, we may discover that the linear programming
formulation did not consider all the attractive alternative activities. Considering a new
activity requires introducing a new variable with the appropriate coefficients into the
objective function and constraints of the current model, which is Case 2b.

The convenient way to deal with this case is to treat it just as if it were Case
2a! This is done by pretending that the new variable \(x_j\) actually was in the original
model with all its coefficients equal to zero (so that they still are zero in the final
simplex tableau) and that \(x_j\) is a nonbasic variable in the current basic feasible solution.
Therefore, if we change these zero coefficients to their actual values for the new
variable, the procedure (including any reoptimization) does indeed become identical
to that for Case 2a.

In particular, all you have to do to check whether the current solution still is 
optimal is check whether the complementary basic solution y* satisfies the one new 
dual constraint that corresponds to the new variable in the primal problem. We already 
have described this approach and then illustrated it for the Wyndor Glass Co. problem 
in Sec. 6.5. 


Case 3— Changes in the Coefficients of a Basic Variable 


Now suppose that the variable \(x_j\) (fixed j) under consideration is a basic variable in
the optimal solution shown by the final simplex tableau (so \(x_j\) appears in the first
column of this tableau). Case 3 assumes that the only changes in the current model
are in the coefficients of this variable.

Case 3 differs from Case 2a because of the requirement that a simplex tableau
be in proper form from Gaussian elimination. This requirement allows the column
for a nonbasic variable to be anything, so it does not affect Case 2a. However, for
Case 3, it requires that the basic variable \(x_j\) must have a coefficient of 1 in its row of
the simplex tableau and a coefficient of zero in every other row (including row 0).
Therefore, after the changes in the \(x_j\) column of the final simplex tableau have been
calculated,¹ it probably will be necessary to apply Gaussian elimination to restore this
form, as was illustrated in Table 6.19. This step in turn probably will change the
value of the current basic solution and may make it either infeasible or nonoptimal.
Consequently, all the steps of the overall procedure summarized at the end of Sec.
6.6 are required for Case 3.


EXAMPLE: Because \(x_2\) is a basic variable in Table 6.20 for the Wyndor Glass Co.
problem, sensitivity analysis of its coefficients fits Case 3. Given the current optimal
solution (\(x_1 = 0\), \(x_2 = 9\)), product 2 is the only new product that should be introduced,
and its production rate should be relatively large. Therefore, the key question now is
whether the initial estimates that led to the coefficients of \(x_2\) in the current model
could have overestimated the attractiveness of product 2 so much that they invalidate
this conclusion. This question can be tested by checking the most pessimistic set of
reasonable estimates for these coefficients, which turns out to be \(c_2 = 3\), \(a_{22} = 3\),
and \(a_{32} = 4\).

The graphical effect of these changes is that the feasible region changes from
the one shown in Fig. 6.3 to the one in Fig. 6.4. The optimal solution in Fig. 6.3 is
\((x_1, x_2) = (0, 9)\), which is the corner-point solution lying at the intersection of the
\(x_1 = 0\) and \(3x_1 + 2x_2 = 18\) constraint boundaries. With the revision of the constraints,
the corresponding corner-point solution in Fig. 6.4 is \((0, \tfrac{9}{2})\). However, this solution
no longer is optimal, because the revised objective function of \(Z = 3x_1 + 3x_2\) now
yields a new optimal solution of \((x_1, x_2) = (4, \tfrac{3}{2})\).

Now let us see how to draw these same conclusions algebraically. Because the
only changes in the model are in the coefficients of $x_2$, the only resulting changes in
the final simplex tableau (Table 6.20) are in the $x_2$ column. Therefore, the formulas
in Table 6.17 are used to recompute just this column.


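\[
z_2^* - \bar{c}_2 = \mathbf{y}^*\bar{\mathbf{A}}_2 - \bar{c}_2
= [0,\ 0,\ \tfrac{5}{2}] \begin{bmatrix} 0 \\ 3 \\ 4 \end{bmatrix} - 3 = 7,
\]
\[
\mathbf{A}_2^* = \mathbf{S}^*\bar{\mathbf{A}}_2
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & \tfrac{1}{2} \\ 0 & 1 & -1 \end{bmatrix}
\begin{bmatrix} 0 \\ 3 \\ 4 \end{bmatrix}
= \begin{bmatrix} 0 \\ 2 \\ -1 \end{bmatrix}.
\]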


¹ For the relatively sophisticated reader, we should point out a possible pitfall for Case 3 that would be
discovered at this point. Specifically, the changes in the initial tableau can destroy the linear independence
of the columns of coefficients of basic variables. This event occurs only if the unit coefficient of the basic
variable $x_j$ in the final tableau has been changed to zero at this point, in which case more extensive simplex
method calculations must be used for Case 3.
Figure 6.4 Feasible region for the Case 3 example. [Figure not reproduced. Axes: $x_1$
(horizontal) and $x_2$ (vertical). Constraint boundaries shown: $x_1 = 0$, $x_2 = 0$, $x_1 = 4$,
$2x_2 = 24$, $3x_2 = 24$, $3x_1 + 2x_2 = 18$, and $3x_1 + 4x_2 = 18$; the point $(4, \tfrac{3}{2})$ is marked
optimal.]

(Equivalently, incremental analysis with $\Delta c_2 = -2$, $\Delta a_{22} = 1$, and $\Delta a_{32} = 2$ can
be used in the same way to obtain this column.)
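
The recomputation of the $x_2$ column can be checked numerically. The following is a minimal Python sketch, assuming that $\mathbf{y}^*$ and $\mathbf{S}^*$ are read off the final tableau of Table 6.20; the helper name `revised_column` is hypothetical and is used only for illustration.

```python
from fractions import Fraction as F

# Assumed from the final tableau of Table 6.20 (the model with b_2 = 24).
y_star = [F(0), F(0), F(5, 2)]        # row 0 coefficients of the slack variables
S_star = [[F(1), F(0), F(0)],
          [F(0), F(0), F(1, 2)],
          [F(0), F(1), F(-1)]]        # slack-variable columns of rows 1 to 3

def revised_column(c_bar, A_bar):
    """Return (z_j* - c_bar_j, A_j*) for a revised coefficient c_bar_j and column A_bar_j."""
    z_minus_c = sum(y * a for y, a in zip(y_star, A_bar)) - c_bar
    A_star = [sum(s * a for s, a in zip(row, A_bar)) for row in S_star]
    return z_minus_c, A_star

# Pessimistic estimates for product 2: c_2 = 3, a_22 = 3, a_32 = 4 (a_12 stays 0).
row0_coef, column = revised_column(F(3), [F(0), F(3), F(4)])
print(row0_coef, column)   # row 0 entry 7; column entries 0, 2, -1
```

Exact fractions are used so that the entries match the hand calculation without rounding.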

The resulting revised final tableau is shown at the top of Table 6.21. Note that
the new coefficients of the basic variable $x_2$ do not have the required values, so the
conversion-to-proper-form-from-Gaussian-elimination step must be applied next. This
step involves dividing row 2 by 2, subtracting 7 times the new row 2 from row 0,
and adding the new row 2 to row 3.

The resulting second tableau in Table 6.21 gives the new value of the current
basic solution, namely, $x_3 = 4$, $x_2 = \tfrac{9}{2}$, $x_4 = \tfrac{21}{2}$ ($x_1 = 0$, $x_5 = 0$). Since all these
variables are nonnegative, the solution is still feasible. However, because of the
negative coefficient of $x_1$ in row 0, we know that it is no longer optimal. Therefore,
the simplex method would be applied to this tableau, with this solution as the initial
basic feasible solution, to find the new optimal solution. The initial entering basic
variable is $x_1$, with $x_3$ as the leaving basic variable. Just one iteration is needed in
this case to reach the new optimal solution: $x_1 = 4$, $x_2 = \tfrac{3}{2}$, $x_4 = \tfrac{39}{2}$ ($x_3 = 0$,
$x_5 = 0$), as shown in the last tableau of Table 6.21.
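
The two steps just described (restoring proper form and then one simplex iteration) both reduce to a Gaussian-elimination pivot. The sketch below is one possible way to replay them in Python; the function name `pivot` and the list-of-rows layout of the tableau are assumptions made for illustration, and the pivot positions are chosen by hand rather than by a full simplex routine.

```python
from fractions import Fraction as F

def pivot(tableau, row, col):
    """Scale the pivot row so tableau[row][col] is 1, then clear that column elsewhere."""
    p = tableau[row][col]
    tableau[row] = [v / p for v in tableau[row]]
    for r in range(len(tableau)):
        if r != row and tableau[r][col] != 0:
            factor = tableau[r][col]
            tableau[r] = [v - factor * w for v, w in zip(tableau[r], tableau[row])]

# Revised final tableau; columns are x1, x2, x3, x4, x5, right side.
T = [[F(9, 2), F(7),  F(0), F(0), F(5, 2), F(45)],   # row 0
     [F(1),    F(0),  F(1), F(0), F(0),    F(4)],    # row 1 (x3)
     [F(3, 2), F(2),  F(0), F(0), F(1, 2), F(9)],    # row 2 (x2)
     [F(-3),   F(-1), F(0), F(1), F(-1),   F(6)]]    # row 3 (x4)

pivot(T, 2, 1)                                # restore proper form for basic variable x2
proper_form_rhs = [row[-1] for row in T]      # Z = 27/2, x3 = 4, x2 = 9/2, x4 = 21/2
pivot(T, 1, 0)                                # simplex iteration: x1 enters, x3 leaves
optimal_rhs = [row[-1] for row in T]          # Z = 33/2, x1 = 4, x2 = 3/2, x4 = 39/2
print(proper_form_rhs, optimal_rhs)
```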

Now look again in Fig. 6.4 at this new optimal solution $(4, \tfrac{3}{2})$. This solution is
optimal under the current pessimistic estimate that $c_2 = 3$, so $Z = 3x_1 + 3x_2$.
However, under the original estimate that $c_2 = 5$, the solution $(0, \tfrac{9}{2})$ would be
optimal. Because the objective function line is parallel to the constraint line
$3x_1 + 4x_2 = 18$ when $c_2 = 4$, the crossover point from one optimal solution to the other
is at $c_2 = 4$. If $c_2$ were less than 3, then $(4, \tfrac{3}{2})$ would remain optimal as long as
$c_2 \ge 0$. Therefore, the allowable range for $c_2$ without changing the optimal solution
$(4, \tfrac{3}{2})$ is $0 \le c_2 \le 4$.


Table 6.21 Sensitivity Analysis Procedure Applied to Case 3 Example

                               Basic     Eq.             Coefficient of               Right
                               Variable  No.    Z    x1     x2    x3     x4    x5     Side

Revised final tableau          Z         0      1    9/2    7     0      0     5/2    45
                               x3        1      0    1      0     1      0     0      4
                               x2        2      0    3/2    2     0      0     1/2    9
                               x4        3      0    -3     -1    0      1     -1     6

Converted to proper form       Z         0      1    -3/4   0     0      0     3/4    27/2
                               x3        1      0    1      0     1      0     0      4
                               x2        2      0    3/4    1     0      0     1/4    9/2
                               x4        3      0    -9/4   0     0      1     -3/4   21/2

New final tableau after        Z         0      1    0      0     3/4    0     3/4    33/2
reoptimization (only one       x1        1      0    1      0     1      0     0      4
iteration of the simplex       x2        2      0    0      1     -3/4   0     1/4    3/2
method needed in this case)    x4        3      0    0      0     9/4    1     -3/4   39/2

Table 6.22 shows how this allowable range for $c_2$ is calculated algebraically,
where only the relevant portions of the simplex tableaux (row 0 and the row for $x_2$)
are displayed. The starting point (first tableau displayed) is the final tableau from the
bottom of Table 6.21. The key then is to apply steps 1, 2, 3, and 5 of the sensitivity
analysis procedure (Sec. 6.6) when $c_2 = 3$ is increased or decreased by a small
amount, where $\Delta c_2 = \pm 1$ is the convenient amount. The second and third tableaux
in Table 6.22 show the effect of $\Delta c_2 = 1$, whereas the fourth and fifth tableaux repeat
the procedure for $\Delta c_2 = -1$. (Note that the second and fourth tableaux differ from
the first only in that the coefficient of $x_2$ in row 0 changes from 0 to $-\Delta c_2$.) With
$\Delta c_2 = 1$, the coefficient of $x_3$ in row 0 decreases from $\tfrac{3}{4}$ (first tableau) to 0 (third
tableau), which indicates that $\Delta c_2 = 1$ is the crossover point above which the current
basic solution no longer would be optimal. With $\Delta c_2 = -1$, the only number in row
0 that decreases is the coefficient of $x_5$. The decrease is from $\tfrac{3}{4}$ (first tableau) to $\tfrac{1}{2}$
(fifth tableau), which is only one-third of the way to 0. By linear extrapolation,
$\Delta c_2 = -3$ thereby is the crossover point below which the current basic solution no longer
is optimal. Therefore, $-3 \le \Delta c_2 \le 1$ (or $0 \le c_2 \le 4$) is the allowable range without
changing the optimal solution.


Table 6.22 Obtaining Allowable Range for c2 without Changing New Optimal Solution
for Case 3 Example

                               Basic     Eq.             Coefficient of               Right
                               Variable  No.    Z    x1    x2    x3     x4    x5      Side

Final tableau                  Z         0      1    0     0     3/4    0     3/4     33/2
                               x2        2      0    0     1     -3/4   0     1/4     3/2

Revised final tableau          Z         0      1    0     -1    3/4    0     3/4     33/2
when Δc2 = 1                   x2        2      0    0     1     -3/4   0     1/4     3/2

Converted to proper            Z         0      1    0     0     0      0     1       18
form                           x2        2      0    0     1     -3/4   0     1/4     3/2

Revised final tableau          Z         0      1    0     1     3/4    0     3/4     33/2
when Δc2 = -1                  x2        2      0    0     1     -3/4   0     1/4     3/2

Converted to proper            Z         0      1    0     0     3/2    0     1/2     15
form                           x2        2      0    0     1     -3/4   0     1/4     3/2
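
The linear extrapolation used above can also be written as a direct calculation. After conversion to proper form, each nonbasic row 0 coefficient equals its current value plus $\Delta c_2$ times the corresponding entry in the row for $x_2$, so the allowable range is where all of these stay nonnegative. The sketch below assumes this relationship and uses a hypothetical helper name, `allowable_delta_range`.

```python
from fractions import Fraction as F

# Final tableau of Table 6.21, nonbasic columns x3 and x5 only.
row0 = {"x3": F(3, 4), "x5": F(3, 4)}        # row 0 coefficients
row_x2 = {"x3": F(-3, 4), "x5": F(1, 4)}     # coefficients in the row for x2

def allowable_delta_range(row0, basic_row):
    """Range of delta_c keeping every row0[v] + delta_c * basic_row[v] nonnegative."""
    lower, upper = None, None                # None means unbounded on that side
    for var, coef in row0.items():
        slope = basic_row[var]
        if slope > 0:                        # coefficient falls as delta_c decreases
            bound = -coef / slope
            lower = bound if lower is None else max(lower, bound)
        elif slope < 0:                      # coefficient falls as delta_c increases
            bound = -coef / slope
            upper = bound if upper is None else min(upper, bound)
    return lower, upper

delta_low, delta_high = allowable_delta_range(row0, row_x2)
c2 = 3
print(delta_low, delta_high, c2 + delta_low, c2 + delta_high)   # -3 1 0 4
```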

All of this analysis suggests that $c_2$, $a_{22}$, and $a_{32}$ are relatively sensitive
parameters. However, additional data for estimating them more closely can be obtained only
by conducting a pilot run. Therefore, the OR Department recommends that production
of product 2 be initiated immediately on a small scale ($x_2 = \tfrac{3}{2}$) and that this experience
be used to guide the decision on whether the remaining production capacity should
be allocated to product 2 or product 1.


Case 4—Introduction of a New Constraint 


The last case is one in which a new constraint must be introduced into the model after 
it has already been solved. This case may occur because the constraint was overlooked 
initially or because new considerations have arisen since the model was formulated 
originally. Another possibility is that the constraint was deleted purposely to decrease 
computational effort because it appeared to be less restrictive than other constraints 
already in the model, but now this impression needs to be checked with the optimal 
solution actually obtained. 

To see if the current optimal solution would be affected by a new constraint,
all you have to do is check directly whether the optimal solution satisfies the constraint. 
If it does, then it would still be the best feasible solution (i.e., the optimal solution), 
even if the constraint were added to the model. The reason is that a new constraint 
can only eliminate some previously feasible solutions without adding any new ones. 

If the new constraint does eliminate the current optimal solution, and if you
want to find the new solution, then introduce this constraint into the final simplex
tableau (as an additional row) just as if this were the initial tableau, where the usual
additional variable (slack variable or artificial variable) is designated to be the basic
variable for this new row. Because the new row probably will have nonzero coefficients
for some of the other basic variables, the conversion-to-proper-form-from-Gaussian-elimination
step is applied next, and then the reoptimization step is applied in
the usual way.

Just as for some of the preceding cases, this procedure for Case 4 is a streamlined 
version of the general procedure summarized at the end of Sec. 6.6. The only question 
to be addressed for this case is whether the previously optimal solution still is feasible, 
so step 5 (optimality test) has been deleted. Step 4 (feasibility test) has been replaced 
by a much quicker test of feasibility (does the previously optimal solution satisfy the 
new constraint?) to be performed right after step 1 (revision of model). It is only if 
this test provides a negative answer, and you wish to reoptimize, that steps 2, 3, and 
6 are used (revision of final tableau, conversion to proper form from Gaussian
elimination, and reoptimization).


EXAMPLE: To illustrate this case, suppose that the new constraint,

\[ 2x_1 + 3x_2 \le 24, \]

is introduced into the model given in Table 6.20. The graphical effect is shown in
Fig. 6.5. The previous optimal solution (0, 9) violates the new constraint, so the
optimal solution changes to (0, 8).

Figure 6.5 Feasible region for the Case 4 example. [Figure not reproduced. Axes: $x_1$
(horizontal, marked from 0 to 14) and $x_2$ (vertical). Constraint boundaries shown include
$x_1 = 0$, $2x_2 = 24$, and $2x_1 + 3x_2 = 24$; the point (0, 8) is marked optimal.]

To analyze this example algebraically, note that (0, 9) yields $2x_1 + 3x_2 =
27 > 24$, so this previous optimal solution is no longer feasible. To find the new
optimal solution, add the new constraint to the current final simplex tableau as just
described, with the slack variable $x_6$ as its initial basic variable. This step yields the
first tableau shown in Table 6.23. The conversion-to-proper-form-from-Gaussian-elimination
step then requires subtracting three times row 2 from the new row,
which identifies the current basic solution: $x_3 = 4$, $x_2 = 9$, $x_4 = 6$, $x_6 = -3$
($x_1 = 0$, $x_5 = 0$), as shown in the second tableau. Applying the dual simplex
method to this tableau then leads in just one iteration (more are sometimes needed) to
the new optimal solution in the last tableau of Table 6.23.
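
The Case 4 procedure for this example can be replayed step by step. The following Python sketch is illustrative only: the `pivot` helper and the list-of-rows tableau layout are assumptions, the leaving row is fixed by hand (the new row, whose right side is negative), and the entering column is picked by the usual dual simplex ratio test.

```python
from fractions import Fraction as F

def pivot(tableau, row, col):
    """Scale the pivot row so tableau[row][col] is 1, then clear that column elsewhere."""
    p = tableau[row][col]
    tableau[row] = [v / p for v in tableau[row]]
    for r in range(len(tableau)):
        if r != row and tableau[r][col] != 0:
            factor = tableau[r][col]
            tableau[r] = [v - factor * w for v, w in zip(tableau[r], tableau[row])]

# Final tableau of Table 6.20 plus the new row; columns x1..x6, right side.
T = [[F(9, 2), 0, 0, 0, F(5, 2), 0, 45],    # row 0
     [1,       0, 1, 0, 0,       0, 4],     # x3
     [F(3, 2), 1, 0, 0, F(1, 2), 0, 9],     # x2
     [-3,      0, 0, 1, -1,      0, 6],     # x4
     [2,       3, 0, 0, 0,       1, 24]]    # new constraint, slack x6 basic
T = [[F(v) for v in row] for row in T]

# Quick feasibility test: does (x1, x2) = (0, 9) satisfy 2*x1 + 3*x2 <= 24?
still_feasible = 2 * 0 + 3 * 9 <= 24         # False, so reoptimization is needed

# Proper form: subtract 3 times the x2 row from the new row.
T[4] = [a - 3 * b for a, b in zip(T[4], T[2])]

# Dual simplex iteration: among negative entries of the leaving row, the entering
# column minimizes (row 0 coefficient) / |entry|.
leave = 4
candidates = [j for j in range(6) if T[leave][j] < 0]
enter = min(candidates, key=lambda j: T[0][j] / -T[leave][j])
pivot(T, leave, enter)
new_rhs = [row[-1] for row in T]             # Z = 40, x3 = 4, x2 = 8, x4 = 8, x5 = 2
print(enter, new_rhs)
```

Column index 4 corresponds to $x_5$, which replaces $x_6$ as the basic variable for the new row.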


Systematic Sensitivity Analysis—Parametric Programming 


So far we have described how to test specific changes in the model parameters. 
Another common approach to sensitivity analysis is to vary one or more parameters 
continuously over some interval(s) to see when the optimal solution changes. 

For example, with the Wyndor Glass Co. problem, rather than beginning by
testing the specific change from $b_2 = 12$ to $\bar{b}_2 = 24$, we might instead set

\[ \bar{b}_2 = 12 + \theta, \]

and then vary $\theta$ continuously from 0 to 12 (the maximum value of interest). The
geometric interpretation in Fig. 6.3 is that the $2x_2 = 12$ constraint line is being shifted
upward to $2x_2 = 12 + \theta$, with $\theta$ being increased from 0 to 12. The result is that the
original optimal corner-point feasible solution (2, 6) shifts up the $3x_1 + 2x_2 = 18$
constraint line toward $(-2, 12)$. This corner-point solution remains optimal as long
as it is still feasible ($x_1 \ge 0$), after which (0, 9) becomes the optimal solution.


Table 6.23 Sensitivity Analysis Procedure Applied to Case 4 Example

                               Basic     Eq.               Coefficient of                    Right
                               Variable  No.    Z    x1     x2    x3    x4    x5     x6      Side

Revised final tableau          Z         0      1    9/2    0     0     0     5/2    0       45
                               x3        1      0    1      0     1     0     0      0       4
                               x2        2      0    3/2    1     0     0     1/2    0       9
                               x4        3      0    -3     0     0     1     -1     0       6
                               x6        New    0    2      3     0     0     0      1       24

Converted to proper form       Z         0      1    9/2    0     0     0     5/2    0       45
                               x3        1      0    1      0     1     0     0      0       4
                               x2        2      0    3/2    1     0     0     1/2    0       9
                               x4        3      0    -3     0     0     1     -1     0       6
                               x6        New    0    -5/2   0     0     0     -3/2   1       -3

New final tableau after        Z         0      1    1/3    0     0     0     0      5/3     40
reoptimization (only one       x3        1      0    1      0     1     0     0      0       4
iteration of the dual          x2        2      0    2/3    1     0     0     0      1/3     8
simplex method needed          x4        3      0    -4/3   0     0     1     0      -2/3    8
in this case)                  x5        New    0    5/3    0     0     0     1      -2/3    2

The algebraic calculations of the effect of having $\Delta b_2 = \theta$ are directly analogous
to those for the Case 1 example where $\Delta b_2 = 12$. In particular, by using the
expressions for $Z^*$ and $\mathbf{b}^*$ given in Table 6.17, the middle tableau in Table 6.18 indicates
that the corresponding optimal solution is

\[
\begin{aligned}
Z   &= 36 + \tfrac{3}{2}\theta \\
x_3 &= 2 + \tfrac{1}{3}\theta \\
x_2 &= 6 + \tfrac{1}{2}\theta \\
x_1 &= 2 - \tfrac{1}{3}\theta
\end{aligned}
\qquad (x_4 = 0,\ x_5 = 0)
\]

for $\theta$ small enough that this solution still is feasible, i.e., for $\theta \le 6$. For $\theta > 6$, the
dual simplex method yields the tableau shown in Table 6.20, with $Z = 45$, $x_3 = 4$,
$x_2 = 9$, but $x_4 = -6 + \theta$ ($x_1 = 0$, $x_5 = 0$). This information can then be used
(along with other data not incorporated into the model on the effect of increasing $b_2$)
to decide whether to retain the original optimal solution and, if not, how much to
increase $b_2$.

In a similar way, we can investigate the effect on the optimal solution of varying
several parameters simultaneously. When we vary just $b_i$ parameters, we express the
new value $\bar{b}_i$ in terms of the original value $b_i$ as follows:

\[ \bar{b}_i = b_i + \alpha_i\theta, \qquad \text{for } i = 1, 2, \ldots, m, \]

where the $\alpha_i$ are input constants specifying the desired rate of increase (positive or
negative) of the corresponding right-hand side as $\theta$ is increased.


For example, suppose that it is possible to shift some of the production of a
current Wyndor Glass Co. product from Plant 2 to Plant 3, thereby increasing $b_2$ by
decreasing $b_3$. Also suppose that $b_3$ decreases twice as fast as $b_2$ increases. Then

\[ \bar{b}_2 = 12 + \theta, \qquad \bar{b}_3 = 18 - 2\theta, \]

where the (nonnegative) value of $\theta$ measures the amount of production shifted. (Thus
$\alpha_1 = 0$, $\alpha_2 = 1$, and $\alpha_3 = -2$ in this case.) Referring to Fig. 6.3, the geometric
interpretation is that, as $\theta$ is increased from 0, the $2x_2 = 12$ constraint line is being
pushed up to $2x_2 = 12 + \theta$ (ignore the $2x_2 = 24$ line) and simultaneously the
$3x_1 + 2x_2 = 18$ constraint line is being pushed down to $3x_1 + 2x_2 = 18 - 2\theta$.
The original optimal corner-point feasible solution (2, 6) lies at the intersection of the
$2x_2 = 12$ and $3x_1 + 2x_2 = 18$ lines, so shifting these lines causes this corner-point
solution to shift. However, with the objective function of $Z = 3x_1 + 5x_2$, this
corner-point solution will remain optimal as long as it is still feasible ($x_1 \ge 0$).

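An algebraic investigation of simultaneously changing $b_2$ and $b_3$ in this way
again involves using the formulas in Table 6.17 to calculate the resulting changes in
the final tableau (middle of Table 6.18). The only quantities that can change are in
the right-side column, namely,

\[
Z^* = \mathbf{y}^*\bar{\mathbf{b}} = [0,\ \tfrac{3}{2},\ 1]
\begin{bmatrix} 4 \\ 12 + \theta \\ 18 - 2\theta \end{bmatrix} = 36 - \tfrac{1}{2}\theta,
\]
\[
\mathbf{b}^* = \mathbf{S}^*\bar{\mathbf{b}} =
\begin{bmatrix} 1 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix}
\begin{bmatrix} 4 \\ 12 + \theta \\ 18 - 2\theta \end{bmatrix}
= \begin{bmatrix} 2 + \theta \\ 6 + \tfrac{1}{2}\theta \\ 2 - \theta \end{bmatrix}.
\]

Therefore, the optimal solution becomes

\[
\begin{aligned}
Z   &= 36 - \tfrac{1}{2}\theta \\
x_3 &= 2 + \theta \\
x_2 &= 6 + \tfrac{1}{2}\theta \\
x_1 &= 2 - \theta
\end{aligned}
\qquad (x_4 = 0,\ x_5 = 0)
\]

for $\theta$ small enough that this solution still is feasible, i.e., for $\theta \le 2$. (Check this
conclusion in Fig. 6.3.) However, the fact that $Z$ decreases as $\theta$ increases from 0
indicates that the best choice for $\theta$ is $\theta = 0$, so none of the possible shifting of
production should be done.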
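
The right-side calculations for $\bar{b}_i = b_i + \alpha_i\theta$ can be sketched in a few lines of Python. This is only an illustration under stated assumptions: $\mathbf{y}^*$ and $\mathbf{S}^*$ are taken from the final tableau of the original Wyndor model (middle of Table 6.18), and `parametric_rhs` is a hypothetical helper that returns each quantity as a (constant, slope) pair in $\theta$ together with the largest $\theta$ for which the basic solution stays feasible.

```python
from fractions import Fraction as F

# Assumed from the final tableau of the original Wyndor model (middle of Table 6.18).
y_star = [F(0), F(3, 2), F(1)]
S_star = [[F(1), F(1, 3), F(-1, 3)],
          [F(0), F(1, 2), F(0)],
          [F(0), F(-1, 3), F(1, 3)]]
b = [F(4), F(12), F(18)]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def parametric_rhs(alpha):
    """Return Z* and b* as (constant, slope) pairs in theta, plus the largest feasible theta."""
    z = (dot(y_star, b), dot(y_star, alpha))
    b_star = [(dot(row, b), dot(row, alpha)) for row in S_star]
    # A basic variable with negative slope reaches zero at theta = -constant / slope.
    limits = [-const / slope for const, slope in b_star if slope < 0]
    theta_max = min(limits) if limits else None   # None: feasible for all theta >= 0
    return z, b_star, theta_max

# b2 = 12 + theta and b3 = 18 - 2*theta, i.e. alpha = (0, 1, -2).
z, b_star, theta_max = parametric_rhs([F(0), F(1), F(-2)])
print(z, b_star, theta_max)   # Z = 36 - theta/2; x3 = 2 + theta, x2 = 6 + theta/2, x1 = 2 - theta; theta <= 2
```

Calling `parametric_rhs([0, 1, 0])` gives the single-parameter case $\bar{b}_2 = 12 + \theta$ discussed earlier, with feasibility for $\theta \le 6$.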

The approach to varying several $c_j$ parameters simultaneously is similar. In this
case, we express the new value $\bar{c}_j$ in terms of the original value of $c_j$ as follows:

\[ \bar{c}_j = c_j + \alpha_j\theta, \qquad \text{for } j = 1, 2, \ldots, n, \]

where the $\alpha_j$ are input constants specifying the desired rate of increase (positive or
negative) of $c_j$ as $\theta$ is increased.

To illustrate this case, reconsider the sensitivity analysis of $c_1$ and $c_2$ for the
Wyndor Glass Co. problem that was performed earlier in this section. Starting with
the version of the model presented in Table 6.20 and Fig. 6.3, we separately considered
the effect of changing $c_1$ from 3 to 4 (its most optimistic estimate) and $c_2$ from
5 to 3 (its most pessimistic estimate). Now we can simultaneously consider both
changes, as well as various intermediate cases with smaller changes, by setting

\[ \bar{c}_1 = 3 + \theta, \qquad \bar{c}_2 = 5 - 2\theta, \]

where the value of $\theta$ measures the fraction of the maximum possible change that is
made. The result is to replace the original objective function, $Z = 3x_1 + 5x_2$, by a
function of $\theta$,

\[ Z(\theta) = (3 + \theta)x_1 + (5 - 2\theta)x_2, \]

so the optimization now can be performed for any desired (fixed) value of $\theta$ between
0 and 1. By checking the effect as $\theta$ increases from 0 to 1, we can determine just
when and how the optimal solution changes as the error in the original estimates of
these parameters increases.

Considering these changes simultaneously is especially appropriate if there are factors
that cause the parameters to change together. Are the two products competitive in
some sense, so that a larger-than-expected unit profit for one implies a smaller-than-expected
unit profit for the other? Are they both affected by some exogenous factor,
such as the advertising emphasis of a competitor? Is it possible to simultaneously 
change both unit profits through appropriate shifting of personnel and equipment? 

Referring to the feasible region shown in Fig. 6.3, the geometric interpretation
of changing the objective function from $Z = 3x_1 + 5x_2$ to $Z(\theta) = (3 + \theta)x_1 +
(5 - 2\theta)x_2$ is that we are changing the slope of the original objective function line,
$Z = 45 = 3x_1 + 5x_2$, that can be drawn through the optimal solution (0, 9). If $\theta$ is
increased enough, this slope will change sufficiently that the optimal solution will
switch from (0, 9) to another corner-point feasible solution (4, 3). (Check graphically
whether this occurs for $\theta \le 1$.)
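
As a numerical counterpart to the graphical check, one can evaluate $Z(\theta)$ at every corner-point feasible solution and see which one is best. The sketch below assumes the corner points of the Table 6.20 model, derived from the constraints $x_1 \le 4$, $2x_2 \le 24$, $3x_1 + 2x_2 \le 18$ and nonnegativity; the list `corners` and the helper `best_corner` are illustrative names, not part of the text.

```python
from fractions import Fraction as F

# Corner-point feasible solutions of the Table 6.20 model (assumed, see lead-in).
corners = [(0, 0), (4, 0), (4, 3), (0, 9)]

def best_corner(theta):
    """Return the corner point maximizing Z(theta) = (3 + theta)*x1 + (5 - 2*theta)*x2."""
    return max(corners, key=lambda p: (3 + theta) * p[0] + (5 - 2 * theta) * p[1])

at_zero = best_corner(F(0))      # (0, 9)
at_one = best_corner(F(1))       # still (0, 9): Z = 27 there versus 25 at (4, 3)
beyond = best_corner(F(3, 2))    # (4, 3) for this larger value of theta
print(at_zero, at_one, beyond)
```

Under these assumptions the switch to (4, 3) does not occur anywhere in the range $0 \le \theta \le 1$.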

The algebraic procedure for dealing simultaneously with these two changes,
$\Delta c_1 = \theta$ and $\Delta c_2 = -2\theta$, is similar to that shown in Table 6.22 (which dealt with
$\Delta c_2 = \pm 1$ for another version of the model). Although the changes now are expressed
in terms of $\theta$ rather than being specific numerical amounts, $\theta$ is treated just like an
unknown number. Displaying just the relevant rows of the tableaux involved (row 0
and the row for the basic variable $x_2$), the results of this procedure are shown in Table
6.24. The first tableau shown is just the final tableau for the current version of the
model (before changing $c_1$ and $c_2$) as given in Table 6.20. Referring to the formulas
in Table 6.17, the only changes in the revised final tableau shown next are that $\Delta c_1$
and $\Delta c_2$ are subtracted from the row 0 coefficients of $x_1$ and $x_2$, respectively. To
convert this tableau to proper form from Gaussian elimination, we subtract $2\theta$ times
row 2 from row 0, which yields the last tableau shown. The expressions in terms of
$\theta$ for the coefficients of nonbasic variables ($x_1$ and $x_5$) in row 0 of this tableau show
that the current basic feasible solution remains optimal for $\theta \le \tfrac{9}{8}$. Because $\theta = 1$ is
the maximum realistic value of $\theta$, this indicates that $c_1$ and $c_2$ together are insensitive
parameters with respect to the model of Table 6.20. There is no need to try to estimate
these parameters more closely unless other parameters change (as occurred for the
Case 3 example).


Table 6.24 Dealing with Δc1 = θ and Δc2 = -2θ for the Model of Table 6.20

                               Basic     Eq.               Coefficient of                    Right
                               Variable  No.    Z    x1          x2     x3    x4    x5        Side

Final tableau                  Z         0      1    9/2         0      0     0     5/2       45
                               x2        2      0    3/2         1      0     0     1/2       9

Revised final tableau when     Z         0      1    9/2 - θ     2θ     0     0     5/2       45
Δc1 = θ and Δc2 = -2θ          x2        2      0    3/2         1      0     0     1/2       9

Converted to proper form       Z         0      1    9/2 - 4θ    0      0     0     5/2 - θ   45 - 18θ
                               x2        2      0    3/2         1      0     0     1/2       9
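
The row 0 expressions in $\theta$ can be derived mechanically. The following Python sketch assumes the two relevant rows of the final tableau in Table 6.20 and tracks only the slope in $\theta$ of each row 0 coefficient; the variable names are illustrative. It covers just this example, where $x_2$ is the only basic variable whose objective coefficient changes.

```python
from fractions import Fraction as F

# Relevant rows of the final tableau in Table 6.20; columns x1, x2, x3, x4, x5.
row0 = [F(9, 2), F(0), F(0), F(0), F(5, 2)]
row_x2 = [F(3, 2), F(1), F(0), F(0), F(1, 2)]
delta_c = [F(1), F(-2), F(0), F(0), F(0)]     # rates: c1 = 3 + theta, c2 = 5 - 2*theta

# Revised row 0: coefficient_j(theta) = row0_j - delta_c_j * theta.
# Proper form: add (delta_c_2 * theta) times the x2 row, which zeroes the x2 entry.
slopes = [-d + delta_c[1] * a for d, a in zip(delta_c, row_x2)]

# The solution stays optimal while every row0_j + slopes_j * theta is nonnegative.
limits = [-c / s for c, s in zip(row0, slopes) if s < 0]
theta_max = min(limits)
print(slopes, theta_max)   # slopes -4, 0, 0, 0, -1; theta_max = 9/8
```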

As we discussed in Sec. 4.7, this way of continuously varying several parameters 
simultaneously is referred to as parametric linear programming. Section 9.3 presents 
the complete parametric linear programming procedure (including identifying new
optimal solutions for larger values of $\theta$) when just $c_j$ parameters are being varied and
then when just $b_i$ parameters are being varied. Some linear programming software
packages also include routines for varying just the coefficients of a single variable or 
varying just the parameters of a single constraint. In addition to the other applications 
discussed in Sec. 4.7, these procedures provide a convenient way of conducting 
sensitivity analysis systematically. 


6.8 Conclusions 


Every linear programming problem has associated with it a dual linear programming 
problem. There are a number of very useful relationships between the original (primal) 
problem and its dual problem that enhance our ability to analyze the primal problem. 
For example, the economic interpretation of the dual problem gives shadow prices 
that measure the marginal value of the resources in the primal problem, as well as 
providing an interpretation of the simplex method. Because the simplex method can 
be applied directly to either problem in order to solve both of them simultaneously, 
considerable computational effort sometimes can be saved by dealing directly with 
the dual problem. Duality theory, including the dual simplex method for working with 
superoptimal basic solutions, also plays a major role in sensitivity analysis. 

The values used for the parameters of a linear programming model generally 
are just estimates. Therefore, sensitivity analysis needs to be performed to investigate 
what happens if these estimates are wrong. The fundamental insight of Sec. 5.3 
provides the key for performing this investigation efficiently. The general objectives 
are to identify the sensitive parameters that affect the optimal solution, to try to 
estimate these sensitive parameters more closely, and then to select a solution that 
remains good over the range of likely values of the sensitive parameters. This analysis 
is a very important part of most linear programming studies. 
PROBLEMS


1. Construct the primal-dual table and the dual problem for each of the following linear
programming models fitting our standard form:

(a) Model given in Prob. 3 of Chap. 4.

(b) Model given in Prob. 9 of Chap. 4.


2.* Construct the dual problem for each of the following linear programming models 
fitting our standard form: 


(a) Model given in Prob. 4 of Chap. 3. 
(b) Model given in Prob. 7 of Chap. 4.


3. Consider the linear programming model given in Prob. 12 of Chap. 4. 
(a) Construct the primal-dual table and the dual problem for this model. 
(b) What does the fact that Z is unbounded for this model imply about its dual problem? 


4. For each of the following linear programming models, give your recommendation on
the more efficient way (probably) for obtaining an optimal solution: (1) applying the simplex
method directly to this primal problem or (2) applying the simplex method directly to the dual
problem instead. Explain.

(a) Maximize    $Z = 10x_1 - 4x_2 + 7x_3$,
    subject to  $3x_1 - x_2 + 2x_3 \le 25$
                $x_1 - 2x_2 + 3x_3 \le 25$
                $5x_1 + x_2 + 2x_3 \le 40$
                $x_1 + x_2 + x_3 \le 90$
                $2x_1 - x_2 + x_3 \le 20$
    and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$.

(b) Maximize    $Z = 2x_1 + 5x_2 + 3x_3 + 4x_4 + x_5$,
    subject to  $x_1 + 3x_2 + 2x_3 + 3x_4 + x_5 \le 6$
                $4x_1 + 6x_2 + 5x_3 + 7x_4 + x_5 \le 15$
    and         $x_j \ge 0$, for $j = 1, 2, 3, 4, 5$.

5. Consider the following problem. 
Maximize $Z = -x_1 - 2x_2 - x_3$,
subject to xX, + x, + 2x51 
2x, =: x,=1 
and $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$.


(a) Construct the dual problem. 


(b) Use duality theory to show that the optimal solution to the primal problem has 
Z=0. 


6. Consider the simplex tableaux for the Wyndor Glass Co. problem given in Table
4.8. For each tableau, give the economic interpretation of the following items:
(a) Each of the coefficients of the slack variables ($x_3$, $x_4$, $x_5$) in row 0.
(b) Each of the coefficients of the original variables ($x_1$, $x_2$) in row 0.
(c) The resulting choice for the entering basic variable (or the decision to stop after the
final tableau).


7.* Consider the following problem.

Maximize    $Z = 6x_1 + 8x_2$,
subject to  $5x_1 + 2x_2 \le 20$
            $x_1 + 2x_2 \le 10$
and         $x_1 \ge 0$, $x_2 \ge 0$.

(a) Construct the dual problem for this primal problem.

(b) Solve both the primal problem and the dual problem graphically. Identify the
corner-point feasible solutions and corner-point infeasible solutions for both problems.
Calculate the objective function values for all these solutions.

(c) Use the information obtained in part (b) to construct a table listing the complementary
basic solutions and so forth for these problems. (Use the same column headings as
for Table 6.9.)

(d) Solve the primal problem by the simplex method. After each iteration (including
iteration 0), identify the basic feasible solution for this problem and the complementary
basic solution for the dual problem. Also identify the corresponding corner-point
solutions.


8. Consider the model with two functional constraints and two variables given in Prob. 
2 of Chap. 4. Follow the instructions of Prob. 7 above for this model. 


9. Consider the primal and dual problems for the Wyndor Glass Co. example given in 
Table 6.1. Using Tables 5.5, 5.6, 6.8, and 6.9, construct a new table giving the eight sets of 
nonbasic variables for the primal problem in column 1, the corresponding sets of associated 
variables for the dual problem in column 2, and the set of nonbasic variables for each
complementary basic solution of the dual problem in column 3. Explain why this table demonstrates
the complementary slackness property for this example. 


10. Suppose that a primal problem has a degenerate basic feasible solution (one or more 
basic variables equal to zero) as its optimal solution. What does this degeneracy imply about 
the dual problem? Why? Is the converse also true? 


11. Consider the following problem.

Maximize    $Z = 2x_1 - 4x_2$,
subject to  $x_1 - x_2 \le 1$
and         $x_1 \ge 0$, $x_2 \ge 0$.

(a) Construct the dual problem, and then find its optimal solution by inspection.

(b) Use the complementary slackness property and the optimal solution for the dual
problem to find the optimal solution for the primal problem.

(c) Suppose that $c_1$, the coefficient of $x_1$ in the primal objective function, actually can
have any value in the model. For what values of $c_1$ does the dual problem have no
feasible solutions? For these values, what does duality theory then imply about the
primal problem?
12. Consider the following problem.

Maximize    $Z = 2x_1 + 7x_2 + 4x_3$,
subject to  $x_1 + 2x_2 + x_3 \le 10$
            $3x_1 + 3x_2 + 2x_3 \le 10$
and         $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$.

(a) Construct the dual problem for this primal problem.

(b) Use the dual problem to demonstrate that the optimal value of Z for the primal
problem cannot exceed 25.

(c) It has been conjectured that $x_2$ and $x_3$ should be the basic variables for the optimal
solution of the primal problem. Show that this conjecture is not true by directly
deriving this basic solution (and Z) using Gaussian elimination (see Appendix 4).
Simultaneously derive and identify the complementary basic solution for the dual
problem.

(d) Solve the dual problem graphically. Use this solution to identify the basic variables
and the nonbasic variables for the optimal solution of the primal problem. Directly
derive this solution, using Gaussian elimination.


13. Reconsider the model of part (b) of Prob. 4.

(a) Construct its dual problem.

(b) Solve this dual problem graphically.

(c) Use the result from part (b) to identify the nonbasic variables and basic variables
for the optimal basic solution for the primal problem.

(d) Use the results from part (c) to obtain the optimal basic solution for the primal
problem directly by using Gaussian elimination to solve for its basic variables,
starting from the initial system of equations [excluding Eq. (0)] constructed for the
simplex method.

(e) Use the results from part (c) to identify the defining equations (see Sec. 5.1) for the
optimal corner-point solution for the primal problem, and then use these equations
to find this solution.


14. Consider the model given in Prob. 37 of Chap. 5.

(a) Construct the dual problem.

(b) Use the given information about the basic variables in the optimal primal solution
to identify the nonbasic variables and basic variables for the optimal dual solution.

(c) Use the results from part (b) to identify the defining equations (see Sec. 5.1) for the
optimal corner-point solution for the dual problem, and then use these equations to
find this solution.

(d) Solve the dual problem graphically to verify your results from part (c).


15. Consider the model given in Prob. 3 of Chap. 3.

(a) Construct the dual problem for this model.

(b) Use the fact that $(x_1, x_2) = (13, 5)$ is optimal for the primal problem to identify the
nonbasic variables and basic variables for the optimal basic solution for the dual
problem.

(c) Identify the optimal basic solution for the dual problem by directly deriving Eq. (0)
corresponding to the optimal primal solution identified in part (b). Derive this
equation by using Gaussian elimination (see Appendix 4).

(d) Use the results from part (b) to identify the defining equations (see Sec. 5.1) for the
optimal corner-point solution for the dual problem. Verify your optimal dual solution
from part (c) by checking to see that it satisfies this system of equations.


16. Suppose that you also want information about the dual problem when you apply
the revised simplex method (see Sec. 5.2) to the primal problem in our standard form.

(a) How would you identify the optimal solution for the dual problem?

(b) After obtaining the basic feasible solution at each iteration, how would you identify
the complementary basic solution in the dual problem?


17. Consider the primal and dual problems in our standard form presented in matrix
notation at the beginning of Sec. 6.1. Use only this definition of the dual problem for a primal
problem in this form to prove each of the following results.

(a) The weak duality property presented in Sec. 6.1.

(b) If the primal problem has an unbounded feasible region that permits increasing Z
indefinitely, then the dual problem has no feasible solutions.

(c) If the functional constraints for the primal problem, $\mathbf{Ax} \le \mathbf{b}$, are changed to
$\mathbf{Ax} = \mathbf{b}$, the only resulting change in the dual problem is to delete the nonnegativity
constraints, $\mathbf{y} \ge \mathbf{0}$.


18.* Construct the dual problem for the linear programming problem given in Prob. 26 
of Chap. 4. 


19. For each of the following linear programming models, convert this primal problem 
into one of the two forms given in Table 6.14 and then construct its dual problem: 

(a) Model given in Prob. 17 of Chap. 4. 

(b) Model given in Prob. 18 of Chap. 4. 

(c) Model given in Prob. 27 of Chap. 4. 


20. Consider the model with equality constraints given in Prob. 25 of Chap. 4. 

(a) Construct its dual problem by using the corresponding primal-dual form given in 
Table 6.14. 

(b) Demonstrate that the answer in part (a) is correct (i.e., equality constraints yield
dual variables without nonnegativity constraints) by first converting the primal problem
to our standard form (see Table 6.12), then constructing its dual problem, and
then converting this dual problem to the form obtained in part (a).


21.* Consider the model without nonnegativity constraints given in Prob. 22 of 
Chap. 4. 

(a) Construct its dual problem by using the corresponding primal-dual form given in 
Table 6.14. 

(b) Demonstrate that the answer in part (a) is correct (i.e., variables without nonnegativity
constraints yield equality constraints in the dual problem) by first converting
the primal problem to our standard form (see Table 6.12), then constructing its dual
problem, and then converting this dual problem to the form obtained in part (a).


22. Consider the dual problem for the Wyndor Glass Co. example given in Table 6.1. 
Demonstrate that its dual problem is the primal problem given in Table 6.1 by going through 
the conversion steps given in Table 6.13. 


23. Consider the primal and dual problems in our standard form presented in matrix
notation at the beginning of Sec. 6.1. Let $\mathbf{y}^*$ denote the optimal solution for this dual problem.
Suppose that $\mathbf{b}$ is then replaced by $\bar{\mathbf{b}}$. Let $\bar{\mathbf{x}}$ denote the optimal solution for the new primal
problem. Prove that $\mathbf{c}\bar{\mathbf{x}} \le \mathbf{y}^*\bar{\mathbf{b}}$.

24.* Consider the following problem. 
Maximize Z = 3x, + x. + 443, 
subject to 6x, + 3x, + 5x; 5 25 
3x, + 4x, + 5x, = 20 


and x, =0, x, = 0, x, = 0. 
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The corresponding final set of equations yielding the optimal solution is 


(0) Z + 2x, + $x, + 3x5 = 17. 
a) xy 3X2 + 3x4 = ix; ang 
2) x +x — 3x4 + 3x5 = 3. 


(a) Identify the optimai solution from this set of equations. 

(b) Construct the dual problem. 

(c) Identify the optimal solution for the dual problem from the final set of equations. 
Verify this solution by solving the dual problem graphically. 

(d) Suppose that the original problem is changed to 

Maximize   $Z = 3x_1 + 3x_2 + 4x_3$, 
subject to $6x_1 + 2x_2 + 5x_3 \le 25$ 
           $3x_1 + 3x_2 + 5x_3 \le 20$ 
and        $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 

Use duality theory to determine whether the previous optimal solution is still optimal. 

(e) Use the fundamental insight presented in Sec. 5.3 to identify the new coefficients 
of $x_2$ in the final set of equations after it has been adjusted for the changes in the 
original problem given in part (d). 

(f) Now suppose that the only change in the original problem is that a new variable 
$x_{\text{new}}$ has been introduced into the model as follows: 

Maximize   $Z = 3x_1 + x_2 + 4x_3 + 2x_{\text{new}}$, 
subject to $6x_1 + 3x_2 + 5x_3 + 3x_{\text{new}} \le 25$ 
           $3x_1 + 4x_2 + 5x_3 + 2x_{\text{new}} \le 20$ 
and        $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$, $x_{\text{new}} \ge 0$. 

Use duality theory to determine whether the previous optimal solution, along with 
$x_{\text{new}} = 0$, is still optimal. 

(g) Use the fundamental insight presented in Sec. 5.3 to identify the coefficients of $x_{\text{new}}$ 
as a nonbasic variable in the final set of equations resulting from the introduction 
of $x_{\text{new}}$ into the original model as shown in part (f). 


25. Consider the model of Prob. 35. Use duality theory directly to determine whether 
the current basic solution remains optimal after each of the following independent changes. 

(a) The change in part (e) of Prob. 35. 

(b) The change in part (g) of Prob. 35. 


26. Consider the model of Prob. 38. Use duality theory directly to determine whether 
the current basic solution remains optimal after each of the following independent changes. 

(a) The change in part (c) of Prob. 38. 

(b) The change in part (f) of Prob. 38. 


27. Consider the model of Prob. 39. Use duality theory directly to determine whether 
the current basic solution remains optimal after each of the following independent changes. 

(a) The change in part (b) of Prob. 39. 

(b) The change in part (d) of Prob. 39. 


28. Reconsider the model of Prob. 24. You are now to conduct sensitivity analysis by 
independently investigating each of the following six changes in the original model. For each 
change, use the sensitivity analysis procedure to revise the given final set of equations (in 
tableau form) and convert it to the proper form for identifying and evaluating the current basic 
solution. Then test this solution for feasibility and for optimality. (Do not reoptimize.) 

(a) Change the right-hand side of constraint 1 to $b_1 = 15$. 
(b) Change the right-hand side of constraint 2 to $b_2 = 5$. 
(c) Change the coefficient of $x_2$ in the objective function to $c_2 =$ 
(d) Change the coefficient of $x_3$ in the objective function to $c_3 = 3$. 
(e) Change the coefficient of $x_2$ in constraint 2 to $a_{22} = 1$. 
(f) Change the coefficient of $x_1$ in constraint 1 to $a_{11} = 10$. 


29. Consider the following problem. 

Maximize   $Z = 3x_1 + x_2 + 2x_3$, 
subject to $x_1 - x_2 + 2x_3 \le 20$ 
           $2x_1 + x_2 - x_3 \le 10$ 
and        $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 

Let $x_4$ and $x_5$ denote the slack variables for the respective functional constraints. After applying 
the simplex method, the final simplex tableau is 

[Final simplex tableau not reproduced.] 

(a) Perform sensitivity analysis to determine which of the 11 parameters of the model 
are sensitive parameters in the sense that any change in just that parameter’s value 
will change the optimal solution. 

(b) Find the allowable range for each $c_j$ without changing the optimal solution. 

(c) Find the allowable range for each $b_i$ without changing the optimal set of basic 
variables. 


30. For the problem given in Table 6.20, find the allowable range for c, without chang- 
ing the optimal solution. Show your work algebraically, using the tableau given in Table 6.20. 
Then justify your answer from a geometric viewpoint, referring to Fig. 6.3. 

31. For the original Wyndor Glass Co. problem, use the last tableau in Table 4.8 to do 
the following. 

(a) Find the allowable range for each $b_i$ without changing the optimal set of basic 
variables. 
(b) Find the allowable range for $c_1$ and $c_2$ without changing the optimal solution. 


32. For the Case 4 example presented in Sec. 6.7, use the last tableau in Table 6.23 to 
do the following. 

(a) Find the allowable range for each $b_i$ without changing the optimal set of basic 
variables. 
(b) Find the allowable range for $c_1$ and $c_2$ without changing the optimal solution. 


33. Consider the following problem. 

Maximize   $Z = 2x_1 + 5x_2$, 
subject to $x_1 + 2x_2 \le 10$ 
           $x_1 + 3x_2 \le 12$ 
and        $x_1 \ge 0$, $x_2 \ge 0$. 

Let $x_3$ and $x_4$ denote the slack variables for the respective functional constraints. After we apply 
the simplex method, the final simplex tableau is 

[Final simplex tableau not reproduced.] 

While doing post-optimality analysis, you learn that all four $b_i$ and $c_j$ values used in the original 
model just given are accurate only to within ±50 percent. In other words, their ranges of likely 
values are $5 \le b_1 \le 15$, $6 \le b_2 \le 18$, $1 \le c_1 \le 3$, and $2.5 \le c_2 \le 7.5$. Your job now is to 
perform sensitivity analysis to determine for each parameter whether this uncertainty is likely 
to affect the optimal solution. Specifically, for each parameter, determine the allowable range 
of values for which the current basic solution (perhaps with new values for the basic variables) 
will remain optimal. Then divide up the range of likely values between these allowable values 
and other values for which the current basic solution will no longer be optimal. 

(a) Perform this sensitivity analysis graphically on the original model. 

(b) Now perform this sensitivity analysis as described and illustrated in Sec. 6.7 for $b_1$ 
and $c_1$. 
(c) Repeat part (b) for $b_2$. 
(d) Repeat part (b) for $c_2$. 


34. Consider the following problem. 

Maximize   $Z = 3x_1 + 4x_2 + 8x_3$, 
subject to $2x_1 + 3x_2 + 5x_3 \le 9$ 
           $x_1 + 2x_2 + 3x_3 \le 5$ 
and        $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 

Let $x_4$ and $x_5$ denote the slack variables for the respective functional constraints. After we apply 
the simplex method, the final simplex tableau is 

[Final simplex tableau not reproduced; the surviving entries show basic variables $x_1$ and $x_3$, with right sides $Z = 14$ and $x_1 = 2$.] 

While doing post-optimality analysis, you learn that some of the parameter values used in the 
original model just given are just rough estimates, where the range of likely values in each case 
is within ±50 percent of the value used here. For each of the following parameters, perform 
sensitivity analysis to determine whether this uncertainty is likely to affect the optimal solution. 
Specifically, for each parameter, determine the allowable range of values for which the current 
basic solution (perhaps with new values for the basic variables) will remain optimal. Then 
divide up the range of likely values between these allowable values and other values for which 
the current basic solution will no longer be optimal. 

(a) The parameter b}. 

(b) The parameter c3. 

(c) The parameter a). 

(d) The parameter c3. 

(e) The parameter aj). 

(f) The parameter b. 


35.* Consider the following problem. 

Maximize   $Z = -5x_1 + 5x_2 + 13x_3$, 
subject to $-x_1 + x_2 + 3x_3 \le 20$ 
           $12x_1 + 4x_2 + 10x_3 \le 90$ 
and        $x_j \ge 0$ $(j = 1, 2, 3)$. 

If we let $x_4$ and $x_5$ be the slack variables for the respective constraints, the simplex method 
yields the following final set of equations: 

(0) $Z + 2x_3 + 5x_4 = 100$. 
(1) $-x_1 + x_2 + 3x_3 + x_4 = 20$. 
(2) $16x_1 - 2x_3 - 4x_4 + x_5 = 10$. 


Now you are to conduct sensitivity analysis by independently investigating each of the following 
nine changes in the original model. For each change, use the sensitivity analysis procedure to 
revise this set of equations (in tableau form) and convert it to proper form from Gaussian 
elimination for identifying and evaluating the current basic solution. Then test this solution for 
feasibility and for optimality. (Do not reoptimize.) 

(a) Change the right-hand side of constraint 1 to $b_1 = 30$. 

(b) Change the right-hand side of constraint 2 to $b_2 = 70$. 


(c) Change the right-hand sides to 


lel = [e 


(d) Change the coefficient of $x_3$ in the objective function to $c_3 = 8$. 

(e) Change the coefficients of $x_1$ to 

$\begin{bmatrix} c_1 \\ a_{11} \\ a_{21} \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \\ 5 \end{bmatrix}$. 

(f) Change the coefficients of $x_2$ to 

$\begin{bmatrix} c_2 \\ a_{12} \\ a_{22} \end{bmatrix} = \begin{bmatrix} 6 \\ 2 \\ 5 \end{bmatrix}$. 

(g) Introduce a new variable $x_6$ with coefficients 

$\begin{bmatrix} c_6 \\ a_{16} \\ a_{26} \end{bmatrix} = \begin{bmatrix} 10 \\ 3 \\ 5 \end{bmatrix}$. 

(h) Introduce a new constraint $2x_1 + 3x_2 + 5x_3 \le 50$. (Denote its slack variable 
by $x_6$.) 

(i) Change constraint 2 to $10x_1 + 5x_2 + 10x_3 \le 100$. 

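
The final set of equations in Prob. 35 can be cross-checked against the original model before any sensitivity analysis is attempted. The sketch below is only an illustration, not part of the problem: it assumes the basic variables are $x_2$ and $x_5$ (read off the final equations), and the helper names `inverse_2x2` and `final_equations` are hypothetical. It rebuilds row 0 and the constraint rows from the basis inverse using exact fractions.

```python
from fractions import Fraction as F

# Original model of Prob. 35 with the slack columns x4, x5 appended.
c = [F(-5), F(5), F(13), F(0), F(0)]
A = [[F(-1), F(1), F(3), F(1), F(0)],
     [F(12), F(4), F(10), F(0), F(1)]]
b = [F(20), F(90)]
basis = [1, 4]  # zero-based column indices of x2 and x5 (assumed basis)


def inverse_2x2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def final_equations(c, A, b, basis):
    """Rebuild row 0 and the constraint rows of the final tableau."""
    n = len(c)
    B = [[A[i][j] for j in basis] for i in range(2)]
    B_inv = inverse_2x2(B)
    c_B = [c[j] for j in basis]
    # y = c_B * B^{-1}; these multipliers are applied to the original rows.
    y = [sum(c_B[k] * B_inv[k][i] for k in range(2)) for i in range(2)]
    row0 = [sum(y[i] * A[i][j] for i in range(2)) - c[j] for j in range(n)]
    z = sum(y[i] * b[i] for i in range(2))
    rows = [[sum(B_inv[r][i] * A[i][j] for i in range(2)) for j in range(n)]
            for r in range(2)]
    rhs = [sum(B_inv[r][i] * b[i] for i in range(2)) for r in range(2)]
    return row0, z, rows, rhs


row0, z, rows, rhs = final_equations(c, A, b, basis)
print("row 0 coefficients:", [str(v) for v in row0], "Z =", z)
for r, (coeffs, value) in enumerate(zip(rows, rhs), start=1):
    print(f"row {r}:", [str(v) for v in coeffs], "=", value)
```

The same routine can be reused after a change to `c`, `A`, or `b` to see the revised tableau before it is restored to proper form, provided the basis is kept the same.
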
36.* Reconsider the model of Prob. 35. Suppose that we now want to apply parametric 
linear programming analysis to this problem. Specifically, the right-hand sides of the functional 
constraints are changed to 

$20 + 2\theta$ (for constraint 1) 
$90 - \theta$ (for constraint 2), 

where $\theta$ can be assigned any positive or negative values. 

Express the basic solution (and $Z$) corresponding to the original optimal solution as a 
function of $\theta$. Determine the lower and upper bounds on $\theta$ before this solution would become 
infeasible. 


37. Consider the example for Case 3 of sensitivity analysis in Sec. 6.7, where $\bar{c}_2 = 3$, 
$\bar{a}_{22} = 3$, $\bar{a}_{32} = 4$, and where the other parameters are given in Table 6.20. Starting from the 
resulting final tableau given at the bottom of Table 6.21, construct a table like Table 6.24 to 
perform parametric linear programming analysis, where 

$c_1 = 3 + \theta$, $c_2 = 3 + 2\theta$. 

How far can $\theta$ be increased above 0 before the current basic solution is no longer optimal? 

38. Consider the following problem. 

Maximize   $Z = 2x_1 - x_2 + x_3$, 
subject to $3x_1 + x_2 + x_3 \le 60$ 
           $x_1 - x_2 + 2x_3 \le 10$ 
           $x_1 + x_2 - x_3 \le 20$ 
and        $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 

Let $x_4$, $x_5$, and $x_6$ denote the slack variables for the respective constraints. After we apply the 
simplex method, the final simplex tableau is 

[Final simplex tableau not reproduced.] 

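Now you are to conduct sensitivity analysis by independently investigating each of the following 
six changes in the original model. For each change, use the sensitivity analysis procedure to 
revise this set of equations (in tableau form) and convert it to proper form from Gaussian 
elimination for identifying and evaluating the current basic solution. Then test this solution for 
feasibility and for optimality. (Do not reoptimize.) 

(a) Change the right-hand sides 

from $\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} 60 \\ 10 \\ 20 \end{bmatrix}$ to $\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} 70 \\ 20 \\ 10 \end{bmatrix}$. 

(b) Change the coefficients of $x_1$ 

from $\begin{bmatrix} c_1 \\ a_{11} \\ a_{21} \\ a_{31} \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \\ 1 \\ 1 \end{bmatrix}$ to $\begin{bmatrix} c_1 \\ a_{11} \\ a_{21} \\ a_{31} \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 2 \\ 0 \end{bmatrix}$. 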

(c) Change the coefficients of x, 
C3 1 C3 2 
from 13 : to 13 | = ° ; 

a23 2 a23 1 
a33 =] a33 -2 


(d) Change the objective function to $Z = 3x_1 - 2x_2 + 3x_3$. 
(e) Introduce a new constraint $3x_1 - 2x_2 + x_3 \le 30$. (Denote its slack variable by $x_7$.) 
(f) Introduce a new variable $x_8$ with coefficients 

$\begin{bmatrix} c_8 \\ a_{18} \\ a_{28} \\ a_{38} \end{bmatrix} = \begin{bmatrix} -1 \\ -2 \\ 1 \\ 2 \end{bmatrix}$. 


39. Consider the following problem. 

Maximize   $Z = 2x_1 + 7x_2 - 3x_3$, 
subject to $x_1 + 3x_2 + 4x_3 \le 30$ 
           $x_1 + 4x_2 - x_3 \le 10$ 
and        $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 

Letting $x_4$ and $x_5$ be the slack variables for the respective constraints, the simplex method yields 
the following final set of equations: 

(0) $Z + x_2 + x_3 + 2x_5 = 20$. 
(1) $-x_2 + 5x_3 + x_4 - x_5 = 20$. 
(2) $x_1 + 4x_2 - x_3 + x_5 = 10$. 


Now you are to conduct sensitivity analysis by independently investigating each of the following 
seven changes in the original model. For each change, use the sensitivity analysis procedure 
to revise this set of equations (in tableau form) and convert it to proper form from Gaussian 
elimination for identifying and evaluating the current basic solution. Then test this solution for 
feasibility and for optimality. (Do not reoptimize.) 


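(a) Change the right-hand sides to 

$\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} 20 \\ 30 \end{bmatrix}$. 

(b) Change the coefficients of $x_3$ to 

$\begin{bmatrix} c_3 \\ a_{13} \\ a_{23} \end{bmatrix} = \begin{bmatrix} -2 \\ 3 \\ -2 \end{bmatrix}$. 

(c) Change the coefficients of $x_1$ to 

$\begin{bmatrix} c_1 \\ a_{11} \\ a_{21} \end{bmatrix} = \begin{bmatrix} 4 \\ 3 \\ 2 \end{bmatrix}$. 

(d) Introduce a new variable $x_6$ with coefficients 

$\begin{bmatrix} c_6 \\ a_{16} \\ a_{26} \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix}$. 

(e) Change the objective function to $Z = x_1 + 5x_2 - 2x_3$. 
(f) Introduce a new constraint $3x_1 + 2x_2 + 3x_3 \le 25$. 
(g) Change constraint 2 to $x_1 + 2x_2 + 2x_3 \le 35$. 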


40. Reconsider the model of Prob. 39. Suppose that we now want to apply parametric 
linear programming analysis to this problem. Specifically, the right-hand sides of the functional 
constraints are changed to 

$30 + 3\theta$ (for constraint 1) 
$10 - \theta$ (for constraint 2), 

where $\theta$ can be assigned any positive or negative values. 

Express the basic solution (and $Z$) corresponding to the original optimal solution as a 
function of $\theta$. Determine the lower and upper bounds on $\theta$ before this solution would become 
infeasible. 


41. Consider the following problem. 

Maximize   $Z = 2x_1 - x_2 + x_3$, 
subject to $3x_1 - 2x_2 + 2x_3 \le 15$ 
           $-x_1 + x_2 + x_3 \le 3$ 
           $x_1 - x_2 + x_3 \le 4$ 
and        $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 

If we let $x_4$, $x_5$, and $x_6$ be the slack variables for the respective constraints, the simplex method 
yields the following final set of equations: 

(0) $Z + 2x_3 + x_4 + x_5 = 18$. 
(1) $x_2 + 5x_3 + x_4 + 3x_5 = 24$. 
(2) $2x_3 + x_5 + x_6 = 7$. 
(3) $x_1 + 4x_3 + x_4 + 2x_5 = 21$. 


Now you are to conduct sensitivity analysis by independently investigating each of the following 
eight changes in the original model. For each change, use the sensitivity analysis procedure to 
revise this set of equations (in tableau form) and convert it to proper form from Gaussian 
elimination for identifying and evaluating the current basic solution. Then test this solution for 
feasibility and for optimality. (Do not reoptimize.) 


(a) Change the right-hand sides to 

(b) Change the coefficient of $x_3$ in the objective function to $c_3 = 2$. 
(c) Change the coefficient of $x_1$ in the objective function to $c_1 = 3$. 
(d) Change the coefficients of $x_3$ to 

$\begin{bmatrix} c_3 \\ a_{13} \\ a_{23} \\ a_{33} \end{bmatrix} = \begin{bmatrix} 4 \\ 3 \\ 2 \\ 1 \end{bmatrix}$. 

(e) Change the coefficients of $x_1$ and $x_2$ to 

$\begin{bmatrix} c_1 \\ a_{11} \\ a_{21} \\ a_{31} \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ -2 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} c_2 \\ a_{12} \\ a_{22} \\ a_{32} \end{bmatrix} = \begin{bmatrix} -2 \\ -1 \\ 3 \\ 2 \end{bmatrix}$, 

respectively. 

(f) Change the objective function to $Z = 5x_1 + x_2 + 3x_3$. 
(g) Change constraint 1 to $2x_1 - x_2 + 4x_3 \le 12$. 
(h) Introduce a new constraint $2x_1 + x_2 + 3x_3 \le 60$. 


42. Reconsider part (d) of Prob. 41. Use duality theory directly to determine whether 
the original optimal solution is still optimal. 


43. Reconsider the model of Prob. 41. Suppose that you now have the option of making 
trade-offs in the profitability of the first two activities, whereby the objective function coefficient 
of $x_1$ can be increased by any amount by simultaneously decreasing the objective function 
coefficient of $x_2$ by the same amount. Thus the alternative choices of the objective function are 

$Z(\theta) = (2 + \theta)x_1 - (1 + \theta)x_2 + x_3$, 

where any nonnegative value of $\theta$ can be chosen. 

Construct a table like Table 6.24 to perform parametric linear programming analysis on 
this problem. Determine the upper bound on $\theta$ before the original optimal solution would 
become nonoptimal. Then determine the best choice of $\theta$ over this range. 


44. Consider the following problem. 

Maximize   $Z = 10x_1 + 4x_2$, 
subject to $3x_1 + x_2 \le 30$ 
           $2x_1 + x_2 \le 25$ 
and        $x_1 \ge 0$, $x_2 \ge 0$. 

Let $x_3$ and $x_4$ denote the slack variables for the respective functional constraints. After applying 
the simplex method, the final simplex tableau is 

[Final simplex tableau not reproduced.] 

(1) The first constraint is changed to $4x_1 + x_2 \le 40$. 
(2) Parametric programming is introduced to change the objective function to the alternative 
choices of 

$Z(\theta) = (10 - 2\theta)x_1 + (4 + \theta)x_2$, 

where any nonnegative value of $\theta$ can be chosen. 

(a) Construct the resulting revised final tableau (as a function of $\theta$), and then convert 
this tableau to proper form from Gaussian elimination. Use this tableau to identify 
the new optimal solution that applies for either $\theta = 0$ or sufficiently small values 
of $\theta$. 

(b) What is the upper bound on $\theta$ before this optimal solution would become nonoptimal? 

(c) Over the range of $\theta$ from zero to this upper bound, which choice of $\theta$ gives the 
largest value of the objective function? 


45. Consider the following problem. 

Maximize   $Z = 9x_1 + 8x_2 + 5x_3$, 
subject to $2x_1 + 3x_2 + x_3 \le 4$ 
           $5x_1 + 4x_2 + 3x_3 \le 11$ 
and        $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 

Let $x_4$ and $x_5$ denote the slack variables for the respective functional constraints. After we apply 
the simplex method, the final simplex tableau is 

[Final simplex tableau not reproduced; its basic variables are $x_1$ and $x_3$.] 

(a) Suppose that a new technology has become available for conducting the first activity 
considered in this problem. If the new technology were to be adopted to replace the 
existing one, the coefficients of $x_1$ in the model would change 

from $\begin{bmatrix} c_1 \\ a_{11} \\ a_{21} \end{bmatrix} = \begin{bmatrix} 9 \\ 2 \\ 5 \end{bmatrix}$ to $\begin{bmatrix} c_1 \\ a_{11} \\ a_{21} \end{bmatrix} = \begin{bmatrix} 18 \\ 3 \\ 6 \end{bmatrix}$. 

Use the sensitivity analysis procedure to investigate the potential effect and desirability 
of adopting the new technology. Specifically, assuming it were adopted, construct 
the resulting revised final tableau, convert this tableau to proper form from 
Gaussian elimination, and then reoptimize (if necessary) to find the new optimal 
solution. 

(b) Now suppose that you have the option of mixing the old and new technologies for 
conducting the first activity. Let $\theta$ denote the fraction of the technology used that 
is from the new technology, so $0 \le \theta \le 1$. Given $\theta$, the coefficients of $x_1$ in the 
model become 

$\begin{bmatrix} c_1 \\ a_{11} \\ a_{21} \end{bmatrix} = \begin{bmatrix} 9 + 9\theta \\ 2 + \theta \\ 5 + \theta \end{bmatrix}$. 

Construct the resulting revised final tableau (as a function of $\theta$), and convert this 
tableau to proper form from Gaussian elimination. Use this tableau to identify the 
current basic solution as a function of $\theta$. Over the allowable values of $0 \le \theta \le 1$, 
give the range of values of $\theta$ for which this solution is both feasible and optimal. 
What is the best choice of $\theta$ within this range? 


46. Consider the following problem. 

Maximize   $Z = 3x_1 + 5x_2 + 2x_3$, 
subject to $-2x_1 + 2x_2 + x_3 \le 5$ 
           $3x_1 + x_2 - x_3 \le 10$ 
and        $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 

Let $x_4$ and $x_5$ be the slack variables for the respective functional constraints. After applying the 
simplex method, the final simplex tableau is 

[Final simplex tableau not reproduced; its basic variables are $x_1$ and $x_3$.] 

Parametric linear programming analysis now is to be applied simultaneously to the objective 
function and right-hand sides, where the model in terms of the new parameter is the following: 

Maximize   $Z(\theta) = (3 + 2\theta)x_1 + (5 + \theta)x_2 + (2 - \theta)x_3$, 
subject to $-2x_1 + 2x_2 + x_3 \le 5 + 6\theta$ 
           $3x_1 + x_2 - x_3 \le 10 - 8\theta$ 
and        $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 

Construct the resulting revised final tableau (as a function of $\theta$), and convert this tableau to 
proper form from Gaussian elimination. Use this tableau to identify the current basic solution 
as a function of $\theta$. For $\theta \ge 0$, give the range of values of $\theta$ for which this solution is both 
feasible and optimal. What is the best choice of $\theta$ within this range? 

47. Consider the following problem. 

Minimize   $y_0 = 5y_1 + 4y_2$, 
subject to $4y_1 + 3y_2 \ge 4$ 
           $2y_1 + y_2 \ge 3$ 
           $y_1 + 2y_2 \ge 1$ 
           $y_1 + y_2 \ge 2$ 
and        $y_1 \ge 0$, $y_2 \ge 0$. 

Because this primal problem has more functional constraints than variables, suppose that the 
simplex method has been applied directly to its dual problem. If we let $x_5$ and $x_6$ denote the 
slack variables for this dual problem, the resulting final simplex tableau is 

[Final simplex tableau not reproduced.] 

For each of the following independent changes in the original primal model, you now are to 
conduct sensitivity analysis by directly investigating the effect on the dual problem and then 
inferring the complementary effect on the primal problem. For each change, apply the procedure 
for sensitivity analysis summarized at the end of Sec. 6.6 to the dual problem (do not reoptimize), 
and then give your conclusions as to whether the current basic solution for the primal 
problem still is feasible and whether it still is optimal. Then check your conclusions by a direct 
graphical analysis of the primal problem. 

(a) Change the objective function to $y_0 = 3y_1 + 5y_2$. 

(b) Change the right-hand sides of the functional constraints to 3, 5, 2, and 3, respectively. 

(c) Change the first constraint to $2y_1 + 4y_2 \ge 7$. 

(d) Change the second constraint to $5y_1 + 2y_2 \ge 10$. 


48. Consider the Wyndor Glass Co. problem described in Sec. 3.1. Suppose that, in 
addition to considering the introduction of two new products, management now is also considering 
changing the production rate of a certain old product that is still profitable. Refer to Table 
3.1. The capacity used per unit production rate of this old product is 1, 4, and 3 for Plants 1, 
2, and 3, respectively. Therefore, if we let $\theta$ denote the change (positive or negative) in the 
production rate of this old product, the right-hand side of the three functional constraints in 
Sec. 3.1 becomes $(4 - \theta)$, $(12 - 4\theta)$, and $(18 - 3\theta)$, respectively. Thus choosing a negative 
value of $\theta$ would free additional capacity for producing more of the two new products, whereas 
a positive value would have the opposite effect. 

(a) Use a parametric linear programming formulation to determine the effect of different 
choices of $\theta$ on the optimal solution for the product mix of the two new products 
given in the final tableau of Table 4.8. In particular, use the fundamental insight of 
Sec. 5.3 to obtain expressions for $Z$ and the basic variables ($x_3$, $x_2$, and $x_1$) in terms 
of $\theta$, assuming that $\theta$ is sufficiently close to zero that this "final" basic solution 
still is feasible and thus optimal for the given value of $\theta$. 

(b) Now consider the broader question of the choice of $\theta$ along with the product mix 
for the two new products. What is the break-even unit profit for the old product (in 
comparison with the two new products) below which its production rate should be 
decreased ($\theta < 0$) in favor of the new products and above which its production rate 
should be increased ($\theta > 0$) instead? 

(c) If the unit profit is above this break-even point, how much can the old product's 
production rate be increased before the "final" basic feasible solution would become 
infeasible? 

(d) If the unit profit is below this break-even point, how much can the old product's 
production rate be decreased (assuming its previous rate was larger than this decrease) 
before the "final" basic feasible solution would become infeasible? 


49. Consider the following problem. 

Maximize   $Z = 2x_1 - x_2 + 3x_3$, 
subject to $x_1 + x_2 + x_3 = 3$ 
           $x_1 - 2x_2 + x_3 \ge 1$ 
           $2x_2 + x_3 \le 2$ 
and        $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$. 

Suppose that the Big M method (see Sec. 4.6) is used to obtain the initial (artificial) basic 
feasible solution. Let $x_4$ be the artificial slack variable for the first constraint, $x_5$ the slack 
(surplus) variable for the second constraint, $x_6$ the artificial variable for the second constraint, 
and $x_7$ the slack variable for the third constraint. The corresponding final set of equations 
yielding the optimal solution is 

(0) $Z + 5x_2 + (M + 2)x_4 + Mx_6 + x_7 = 8$. 
(1) $x_1 - x_2 + x_4 - x_7 = 1$. 
(2) $2x_2 + x_3 + x_7 = 2$. 
(3) $3x_2 + x_4 + x_5 - x_6 = 2$. 

Suppose that the original objective function is changed to $Z = 2x_1 + 3x_2 + 4x_3$, and that the 
original third constraint is changed to $2x_2 + x_3 \le 1$. Use the sensitivity analysis procedure to 
revise the final set of equations (in tableau form) and convert it to proper form from Gaussian 
elimination for identifying and evaluating the current basic solution. Then test this solution for 
feasibility and for optimality. Reoptimize (if needed), starting from this final tableau, to find 
the new optimal solution. 


Special Types of Linear 
Programming Problems 


Chapter 3 emphasized the wide applicability of linear programming. We continue to 
broaden our horizons in this chapter by discussing some particularly important types 
of linear programming problems. These special types share several key characteristics. 
The first is that they all arise frequently in a variety of contexts. They also tend to 
require a very large number of constraints and variables, so a straightforward computer 
application of the simplex method may require an exorbitant computational effort. 
Fortunately, another characteristic is that most of the $a_{ij}$ coefficients in the constraints 
are zeroes, and the relatively few nonzero coefficients appear in a distinctive pattern. 
As a result, it has been possible to develop special streamlined versions of the simplex 
method that achieve dramatic computational savings by exploiting this special struc- 
ture of the problem. Therefore, it is important to become sufficiently familiar with 
these special types of problems so that you can recognize them when they arise and 
apply the proper computational procedure. 

To describe special structures, we shall introduce the table (matrix) of constraint 
coefficients shown in Table 7.1, where $a_{ij}$ is the coefficient of the $j$th variable in the 
$i$th functional constraint. Later, portions of the table containing only coefficients equal 
to zero will be indicated by leaving them blank, whereas blocks containing nonzero 
coefficients will be shaded darker. 

Table 7.1 Table of Constraint Coefficients for Linear Programming 

Probably the most important special type of linear programming problem is the 
so-called transportation problem, and we shall describe it first. Its special solution 
procedure also will be presented, partially to illustrate the kind of streamlining of the 
simplex method that can be obtained by exploiting the special structure in the problem. 
Next we shall present two special types of linear programming problems (the trans- 
shipment problem and the assignment problem) that are closely related to the trans- 
portation problem, and finally we shall describe a special type that frequently arises 
in multidivisional organizations. 


7.1 The Transportation Problem 


Prototype Example 


One of the main products of the P & T COMPANY is canned peas. The peas are 
prepared at three canneries (near Bellingham, Washington; Eugene, Oregon; and 
Albert Lea, Minnesota) and then shipped by truck to four distributing warehouses in 
the western United States (Sacramento, California; Salt Lake City, Utah; Rapid City, 
South Dakota; and Albuquerque, New Mexico), as shown in Fig. 7.1. Because the 
shipping costs are a major expense, management is initiating a study to reduce them 
as much as possible. For the upcoming season, an estimate has been made of what 
the output will be from each cannery, and each warehouse has been allocated a certain 
amount from the total supply of peas. This information (in units of truckloads), along 
with the shipping cost per truckload for each cannery-warehouse combination, is given 
in Table 7.2. Thus there are a total of 300 truckloads to be shipped. The problem 
now is to determine which plan for assigning these shipments to the various cannery- 
warehouse combinations would minimize the total shipping cost. 

Figure 7.1 Location of canneries and warehouses for the P & T Co. 


Table 7.2 Shipping Data for P & T Co. 

                 Shipping Cost ($) Per Truckload 
                           Warehouse 
Cannery           1       2       3       4      Output 
   1             464     513     654     867       75 
   2             352     416     690     791      125 
   3             995     682     388     685      100 
Allocation        80      65      70      85 


This is actually a linear programming problem of the transportation problem 
type. To formulate the model, let $Z$ denote total shipping cost, and let $x_{ij}$ ($i = 1, 2, 3$; 
$j = 1, 2, 3, 4$) be the number of truckloads to be shipped from cannery $i$ to warehouse 
$j$. Thus the objective is to choose the values of these 12 decision variables (the $x_{ij}$) 
so as to 

Minimize $Z = 464x_{11} + 513x_{12} + 654x_{13} + 867x_{14} + 352x_{21} + 416x_{22}$ 
         $+ 690x_{23} + 791x_{24} + 995x_{31} + 682x_{32} + 388x_{33} + 685x_{34}$, 

subject to the constraints 

$x_{11} + x_{12} + x_{13} + x_{14} = 75$ 
$x_{21} + x_{22} + x_{23} + x_{24} = 125$ 
$x_{31} + x_{32} + x_{33} + x_{34} = 100$ 

$x_{11} + x_{21} + x_{31} = 80$ 
$x_{12} + x_{22} + x_{32} = 65$ 
$x_{13} + x_{23} + x_{33} = 70$ 
$x_{14} + x_{24} + x_{34} = 85$ 

and $x_{ij} \ge 0$ ($i = 1, 2, 3$; $j = 1, 2, 3, 4$). 


Table 7.3 shows the constraint coefficients. As you will see next, it is the special 
structure in the pattern of these coefficients that distinguishes this problem as a trans- 
portation problem, not its context. 

By the way, the optimal solution for this problem is $x_{11} = 0$, $x_{12} = 20$, 
$x_{13} = 0$, $x_{14} = 55$, $x_{21} = 80$, $x_{22} = 45$, $x_{23} = 0$, $x_{24} = 0$, $x_{31} = 0$, $x_{32} = 0$, 
$x_{33} = 70$, $x_{34} = 30$. When you learn the optimality test that appears in Sec. 7.2, 
you will be able to verify this yourself (see Prob. 9). 
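
Although the optimality test has to wait for Sec. 7.2, the feasibility and total cost of the stated solution can be checked right away. The following is a small sketch for that purpose only. The data are taken from the objective function and constraints above, while the helper names `is_feasible` and `total_cost` are illustrative assumptions. It does not prove optimality.

```python
# Shipping cost per truckload, cost[i][j], from cannery i+1 to warehouse j+1.
cost = [[464, 513, 654, 867],
        [352, 416, 690, 791],
        [995, 682, 388, 685]]
supply = [75, 125, 100]      # output of each cannery
demand = [80, 65, 70, 85]    # allocation to each warehouse

# The solution quoted in the text, x[i][j] = truckloads from cannery i+1
# to warehouse j+1.
x = [[0, 20, 0, 55],
     [80, 45, 0, 0],
     [0, 0, 70, 30]]


def is_feasible(x, supply, demand):
    """Check the cannery, warehouse, and nonnegativity constraints."""
    rows_ok = all(sum(row) == s for row, s in zip(x, supply))
    cols_ok = all(sum(col) == d for col, d in zip(zip(*x), demand))
    nonneg = all(v >= 0 for row in x for v in row)
    return rows_ok and cols_ok and nonneg


def total_cost(x, cost):
    """Evaluate the objective function Z for a shipping plan."""
    return sum(c * v for c_row, x_row in zip(cost, x)
               for c, v in zip(c_row, x_row))


print("feasible:", is_feasible(x, supply, demand))
print("total shipping cost Z =", total_cost(x, cost))
```
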


Table 7.3 Table of Constraint Coefficients for the P & T Co. 

                              Coefficient of 
              x11 x12 x13 x14 x21 x22 x23 x24 x31 x32 x33 x34 
Cannery        1   1   1   1 
constraints                    1   1   1   1 
                                               1   1   1   1 
Warehouse      1               1               1 
constraints        1               1               1 
                       1               1               1 
                           1               1               1 


The Transportation Problem Model 


To describe the general model for the transportation problem, we need to use terms 
that are considerably less specific than those for the components of the prototype 
example. In particular, the general transportation problem is concerned (literally or 
figuratively) with distributing any commodity from any group of supply centers, called 
sources, to any group of receiving centers, called destinations, in such a way as to 
minimize the total distribution cost. The correspondence in terminology between the 
prototype example and the general problem is summarized in Table 7.4. 

Thus, in general, source $i$ ($i = 1, 2, \ldots, m$) has a supply of $s_i$ units to 
distribute to the destinations, and destination $j$ ($j = 1, 2, \ldots, n$) has a demand for 
$d_j$ units to be received from the sources. A basic assumption is that the cost of 
distributing units from source $i$ to destination $j$ is directly proportional to the number 
distributed, where $c_{ij}$ denotes the cost per unit distributed. As for the prototype example, 
these input data can be summarized conveniently in the cost and requirements 
table shown in Table 7.5. 

Letting $Z$ be total distribution cost and $x_{ij}$ ($i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$) 
be the number of units to be distributed from source $i$ to destination $j$, the linear 
programming formulation of this problem becomes 

Minimize   $Z = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij}$, 

subject to $\sum_{j=1}^{n} x_{ij} = s_i$, for $i = 1, 2, \ldots, m$ 

           $\sum_{i=1}^{m} x_{ij} = d_j$, for $j = 1, 2, \ldots, n$ 

and        $x_{ij} \ge 0$, for all $i$ and $j$. 


Note that the resulting table of constraint coefficients has the special structure shown 
in Table 7.6. Any linear programming problem that fits this special formulation is of 
the transportation problem type, regardless of its physical context. In fact, there have 
been numerous applications unrelated to transportation that have been fitted to this 
special structure, as we shall illustrate in the next example. (The assignment problem 
described in Sec. 7.4 is an additional example.) This is one of the reasons why the 
transportation problem is generally considered the most important special type of linear 
programming problem. 
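
The special structure can be made concrete by generating the table of constraint coefficients for any $m$ and $n$. The sketch below is illustrative: the function name `transportation_matrix` and the column ordering ($x_{11}, x_{12}, \ldots, x_{1n}, x_{21}, \ldots, x_{mn}$) are assumptions chosen to match the formulation above. Each variable $x_{ij}$ appears in exactly one supply constraint and one demand constraint, so every column contains exactly two 1s.

```python
def transportation_matrix(m, n):
    """Return the (m + n) x (m * n) table of constraint coefficients.

    Column i * n + j (zero-based) corresponds to the variable x_ij.
    The first m rows are supply constraints, the last n rows are
    demand constraints.
    """
    rows = []
    for i in range(m):  # supply constraint for source i
        rows.append([1 if col // n == i else 0 for col in range(m * n)])
    for j in range(n):  # demand constraint for destination j
        rows.append([1 if col % n == j else 0 for col in range(m * n)])
    return rows


# The P & T Co. prototype has m = 3 canneries and n = 4 warehouses.
for row in transportation_matrix(3, 4):
    print(" ".join(str(v) if v else "." for v in row))
```

Printing zeros as dots mirrors the convention of leaving all-zero portions of the table blank.
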

Table 7.4 Terminology for the Transportation Problem 

Prototype Example                       General Problem 
Truckloads of canned peas               Units of a commodity 
Three canneries                         $m$ sources 
Four warehouses                         $n$ destinations 
Output from cannery $i$                 $s_i$ supply from source $i$ 
Allocation to warehouse $j$             $d_j$ demand at destination $j$ 
Shipping cost per truckload from        $c_{ij}$ cost per unit distributed from 
  cannery $i$ to warehouse $j$            source $i$ to destination $j$ 


Table 7.5 Cost and Requirements Table for the Transportation Problem 

                     Cost Per Unit Distributed 
                           Destination 
Source         1          2         ...        n          Supply 
  1         $c_{11}$   $c_{12}$     ...     $c_{1n}$      $s_1$ 
  2         $c_{21}$   $c_{22}$     ...     $c_{2n}$      $s_2$ 
  ...          ...        ...       ...        ...         ... 
  m         $c_{m1}$   $c_{m2}$     ...     $c_{mn}$      $s_m$ 
Demand       $d_1$      $d_2$       ...      $d_n$ 


For many applications, the supply and demand quantities in the model (the $s_i$ 
and $d_j$) have integer values, and implementation will require that the distribution 
quantities (the $x_{ij}$) also have integer values. Fortunately, because of the special structure 
shown in Table 7.6, such problems have the following property. 


Integer solutions property: For transportation problems where every s; and d, 
has an integer value, all the basic variables (allocations) in every basic feasible 
solution (including an optimal one) also have integer values. 


The solution procedure described in Sec. 7.2 deals only with basic feasible 
solutions, so it automatically will obtain an integer optimal solution for this case. 
Therefore, it is unnecessary to add a constraint to the model that the $x_{ij}$ must have
integer values. 

However, in order to have an optimal solution of any kind, a transportation 
problem must possess feasible solutions. The following property indicates when this 
will occur. 


Feasible solutions property: A necessary and sufficient condition for a
transportation problem to have any feasible solutions is that

$$\sum_{i=1}^{m} s_i = \sum_{j=1}^{n} d_j.$$


Table 7.6 Table of Constraint Coefficients for the Transportation Problem

                                      Coefficient of
              x_11 x_12 ... x_1n   x_21 x_22 ... x_2n   ...   x_m1 x_m2 ... x_mn
Supply         1    1   ...  1
constraints                         1    1   ...  1
                                                        ...
                                                               1    1   ...  1
Demand         1                    1                          1
constraints         1                    1                          1
                        ...                  ...                        ...
                             1                    1                          1


This property may be verified by observing that the constraints require that both
$\sum_{i=1}^{m} s_i$ and $\sum_{j=1}^{n} d_j$ equal $\sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij}$.

This condition that the total supply must equal the total demand merely requires that 
the system be in balance. If the problem has physical significance and this condition 
is not met, it usually means that either $s_i$ or $d_j$ actually represents a bound rather than
an exact requirement. If this is the case, a fictitious "source" or "destination" (called
the dummy source or the dummy destination) can be introduced to take up the slack 
in order to convert the inequalities into equalities and satisfy the feasibility condition. 
The next two examples illustrate how to do this conversion, as well as how to fit 
some other common variations into the transportation problem formulation. 
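
The balancing step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `balance` and the list-of-lists cost layout are hypothetical, and every dummy cell is given a unit cost of zero (the examples that follow sometimes need a cost of M in selected dummy cells instead).

```python
def balance(costs, supply, demand):
    """Return a balanced copy of (costs, supply, demand).

    Assumption: unused capacity costs nothing, so all dummy cells get cost 0.
    A dummy destination absorbs excess supply; a dummy source covers
    excess demand. The inputs are not modified.
    """
    costs = [row[:] for row in costs]
    supply, demand = list(supply), list(demand)
    gap = sum(supply) - sum(demand)
    if gap > 0:
        # Excess supply: add a dummy destination column.
        for row in costs:
            row.append(0)
        demand.append(gap)
    elif gap < 0:
        # Excess demand: add a dummy source row.
        costs.append([0] * len(demand))
        supply.append(-gap)
    return costs, supply, demand


def is_feasible(supply, demand):
    """Feasible solutions property: total supply must equal total demand."""
    return sum(supply) == sum(demand)
```

For instance, with hypothetical supplies of 20 and 30 and demands of 10, 15, and 5, `balance` appends a dummy destination with a demand of 20, after which `is_feasible` holds.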


EXAMPLE—PRODUCTION SCHEDULING: The NORTHERN AIRPLANE COM- 
PANY builds commercial airplanes for various airline companies around the world. 
The last stage in the production process is to produce the jet engines and then to 
install them (a very fast operation) in the completed airplane frame. The company has 
been working under some contracts to deliver a considerable number of airplanes in 
the near future, and the production of the jet engines for these planes must now be 
scheduled for the next 4 months. 

To meet the contracted dates for delivery, the company must supply engines for 
installation in the quantities indicated in the second column of Table 7.7. Thus the
cumulative number of engines produced by the end of months 1, 2, 3, and 4 must be 
at least 10, 25, 50, and 70, respectively. 

The facilities that will be available for producing the engines vary according to 
other production, maintenance, and renovation work scheduled during this period. 
The resulting monthly differences in the maximum number that can be produced and 
the cost (in millions of dollars) of producing each one are given in the third and fourth 
columns of Table 7.7. 

Because of the variations in production costs, it may well be worthwhile to 
produce some of the engines a month or more before they are scheduled for instal- 
lation, and this possibility is being considered. The drawback is that such engines 
must be stored until the scheduled installation (the airplane frames will not be ready 
early) at a storage cost of $15,000/month (including interest on expended capital) for 
each engine,¹ as shown in the last column of Table 7.7.


Table 7.7 Production Scheduling Data for Northern Airplane Co.

          Scheduled        Maximum        Unit Cost*        Unit Cost*
Month   Installations    Production     of Production      of Storage
  1          10              25             1.08              0.015
  2          15              35             1.11              0.015
  3          25              30             1.10              0.015
  4          20              10             1.13

* Cost is expressed in millions of dollars.


¹ For modeling purposes, assume that this storage cost is incurred at the end of the month for just those
engines that are being held over into the next month. Thus engines that are produced in a given month for 
installation in the same month are assumed to incur no storage cost. 


The production manager wants a schedule developed for the number of engines 
to be produced in each of the 4 months so that the total of the production and storage 
costs will be minimized. 

One way to formulate a mathematical model for this problem is to let $x_j$ be the
number of jet engines to be produced in month j, for j = 1, 2, 3, 4. By using only 
these four decision variables, the problem can be formulated as a linear programming 
problem that does not fit the transportation problem type. (See Prob. 20.) 

On the other hand, by adopting a different viewpoint, we can instead formulate 
the problem as a transportation problem that requires much less effort to solve. This 
viewpoint will describe the problem in terms of sources and destinations and then 
identify the corresponding $x_{ij}$, $c_{ij}$, $s_i$, and $d_j$. (See if you can do this before reading
further.) 

Because the units being distributed are jet engines, each of which is to be 
scheduled for production in a particular month and then installed in a particular (per- 
haps different) month, 


Source i = production of jet engines in month i (i = 1, 2, 3, 4)

Destination j = installation of jet engines in month j (j = 1, 2, 3, 4)

$x_{ij}$ = number of engines produced in month i for installation in month j

$c_{ij}$ = cost associated with each unit of $x_{ij}$

$$c_{ij} = \begin{cases} \text{cost per unit for production and any storage,} & \text{if } i \le j \\ ?, & \text{if } i > j \end{cases}$$

$s_i$ = ?

$d_j$ = number of scheduled installations in month j.


The corresponding (incomplete) cost and requirements table is given in Table 7.8. 
Thus it remains to identify the missing costs and the supplies. 

Table 7.8 Incomplete Cost and Requirements Table for the Northern Airplane Co.

                     Cost Per Unit Distributed
                            Destination
                 1         2         3         4       Supply
           1   1.080     1.095     1.110     1.125       ?
Source     2     ?       1.110     1.125     1.140       ?
           3     ?         ?       1.100     1.115       ?
           4     ?         ?         ?       1.130       ?
Demand          10        15        25        20


Since it is impossible to produce engines in one month for installation in an
earlier month, $x_{ij}$ must be zero if $i > j$. Therefore, there is no real cost that can be
associated with such $x_{ij}$. Nevertheless, in order to have a well-defined transportation
problem to which a standard software package (solution procedure of Sec. 7.2) can
be applied, it is necessary to assign some value for the unidentified costs. Fortunately,
we can use the Big M method introduced in Sec. 4.6 to assign this value. Thus we
assign a very large number (denoted by M for convenience) to the unidentified cost
entries in Table 7.8 to force the corresponding values of $x_{ij}$ to be zero in the final
solution.

The numbers that need to be inserted into the supply column of Table 7.8 are
not obvious because the "supplies," the amounts produced in the respective months,
are not fixed quantities. In fact, the objective is to solve for the most desirable values
of these production quantities. Nevertheless, it is necessary to assign some fixed
number to every entry in the table, including those in the supply column, to have a
transportation problem. A clue is provided by the fact that although the supply
constraints are not present in the usual form, these constraints do exist in the form of
upper bounds on the amount that can be supplied, namely,

$$\begin{aligned}
x_{11} + x_{12} + x_{13} + x_{14} &\le 25, \\
x_{21} + x_{22} + x_{23} + x_{24} &\le 35, \\
x_{31} + x_{32} + x_{33} + x_{34} &\le 30, \\
x_{41} + x_{42} + x_{43} + x_{44} &\le 10.
\end{aligned}$$


The only change from the standard model for the transportation problem is that these 
constraints are in the form of inequalities instead of equations. 

To convert these inequalities to equations in order to fit the transportation prob- 
lem model, we use the familiar device of slack variables as introduced in Sec. 4.2. 
In this context, the slack variables are allocations to a single dummy destination that 
represent the unused production capacity in the respective months. This change per- 
mits the supply in the transportation problem formulation to be the total production 
capacity in the given month. Furthermore, because the demand for the dummy des- 
tination is the total unused capacity, this demand is 


(25 + 35 + 30 + 10) - (10 + 15 + 25 + 20) = 30.


With this demand included, the sum of the supplies now equals the sum of the 
demands, which is the condition given by the feasible solutions property for having 
feasible solutions. 

The cost entries associated with the dummy destination should be zero because
there is no cost incurred by a fictional allocation. (Cost entries of M would be
inappropriate for this column because we do not want to force the corresponding values
of $x_{ij}$ to be zero. In fact, these values need to sum to 30.)

The resulting final cost and requirements table is given in Table 7.9, with the
dummy destination labeled as destination 5(D). Using this formulation, it is quite easy
to find the optimal production schedule by the solution procedure described in Sec.
7.2. (See Prob. 19 and its answer in the back of the book.)


Table 7.9 Complete Cost and Requirements Table for the Northern Airplane Co.

                          Cost Per Unit Distributed
                                 Destination
                 1         2         3         4        5(D)     Supply
           1   1.080     1.095     1.110     1.125       0         25
Source     2     M       1.110     1.125     1.140       0         35
           3     M         M       1.100     1.115       0         30
           4     M         M         M       1.130       0         10
Demand          10        15        25        20        30
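
As a cross-check on the formulation, the entries of Table 7.9 can be regenerated from the data of Table 7.7. The sketch below is illustrative only: the function name `northern_airplane_table` is hypothetical, M is replaced by an arbitrary large number, and the rule "production cost plus 0.015 per month of storage" is the reading of the cost data used in this example.

```python
M = 10**6  # assumed numeric stand-in for the Big M penalty

production_cost = [1.080, 1.110, 1.100, 1.130]  # millions of dollars per engine
storage_cost = 0.015                            # per engine per month held over
capacity = [25, 35, 30, 10]                     # maximum production by month
installations = [10, 15, 25, 20]                # scheduled installations by month


def northern_airplane_table():
    """Build (costs, supply, demand) with a zero-cost dummy destination."""
    months = range(len(capacity))
    costs = []
    for i in months:
        row = []
        for j in months:
            if i > j:
                # Producing in a later month for an earlier installation
                # is impossible, so it is priced at M.
                row.append(M)
            else:
                row.append(round(production_cost[i] + storage_cost * (j - i), 3))
        row.append(0)  # dummy destination: unused capacity costs nothing
        costs.append(row)
    unused = sum(capacity) - sum(installations)
    return costs, list(capacity), installations + [unused]
```

Calling `northern_airplane_table()` gives a demand of 30 for the dummy destination, matching the calculation above.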


EXAMPLE — DISTRIBUTION OF WATER RESOURCES: The METRO WATER DIS- 
TRICT is an agency that administers the distribution of water in a certain large geo- 
graphic region. The region is fairly arid, so the District must purchase and bring in 
water from outside the region. The sources of this imported water are the Colombo, 
Sacron, and Calorie Rivers. The District then resells the water to users in its region. 
Its main customers are the water departments of the cities of Berdoo, Los Devils, San 
Go, and Hollyglass. 

It is possible to supply any of these cities with water brought in from any of 
the three rivers, with the exception that no provision has been made to supply Holly- 
glass with Calorie River water. However, because of the geographic layouts of the 
viaducts and the cities in the region, the cost to the District of supplying water depends 
upon both the source of the water and the city being supplied. The variable cost per 
acre foot of water (in dollars) for each combination of river and city is given in Table 
7.10. Despite these variations, the price per acre foot charged by the District is 
independent of the source of the water and is the same for all cities. 

The management of the District is now faced with the problem of how to allocate 
the available water during the upcoming summer season. Using units of 1 million
acre feet, the amounts available from the three rivers are given in the right-hand 
column of Table 7.10. The District is committed to providing a certain minimum 
amount to meet the essential needs of each city (with the exception of San Go, which 
has an independent source of water), as shown in the Min. needed row of the table. 
The Requested row indicates that Los Devils desires no more than the minimum 
amount, but that Berdoo would like to buy as much as 20 more, San Go would buy 
up to 30 more, and Hollyglass will take as much as it can get. 

Management wishes to allocate all the available water from the three rivers to 
the four cities in such a way as to at least meet the essential needs of each city while 
minimizing the total cost to the District. 


Table 7.10 Water Resources Data for Metro Water District

                          Cost ($) Per Acre Foot
                                   City
                  Berdoo   Los Devils   San Go   Hollyglass    Supply
Colombo River       16         13         22         17          50
Sacron River        14         13         19         15          60
Calorie River       19         20         23         --          50
Min. needed         30         70          0         10       (in units of
Requested           50         70         30          ∞        million acre feet)


Formulation: Table 7.10 already is close to the proper form for a cost and
requirements table, with the rivers being the sources and the cities being the destinations.
However, the one basic difficulty is that it is not clear what the demands at the
destinations should be. The amount to be received at each destination (except Los
Devils) actually is a decision variable, with both a lower and an upper bound. This
upper bound is the amount requested unless the request exceeds the total supply
remaining after meeting the minimum needs of the other cities, in which case this
remaining supply becomes the upper bound. Thus insatiably thirsty Hollyglass has an
upper bound of


(50 + 60 + 50) - (30 + 70 + 0) = 60.


Unfortunately, just like the other numbers in the cost and requirements table of 
a transportation problem, the demand quantities must be constants, not bounded de- 
cision variables. To begin resolving this difficulty, temporarily suppose that it is not 
necessary to satisfy the minimum needs, so that the upper bounds are the only con- 
straints on amounts to be allocated to the cities. In this circumstance, can the requested 
allocations be viewed as the demand quantities for a transportation problem formu- 
lation? After one adjustment, yes! (Do you see already what the needed adjustment 
is?) 

The situation is analogous to Northern Airplane Co.’s production scheduling
problem, where there was excess supply capacity. Now there is excess demand
capacity. Consequently, rather than introducing a dummy destination to "receive" the
unused supply capacity, the adjustment needed here is to introduce a dummy source
to "send" the unused demand capacity. The imaginary supply quantity for this dummy
source would be the amount by which the sum of the demands exceeds the sum of
the real supplies:


(50 + 70 + 30 + 60) - (50 + 60 + 50) = 50.





This formulation yields the cost and requirements table shown in Table 7.11, 
which uses units of million acre feet and million dollars. The cost entries in the Dummy 
row are zero because there is no cost incurred by the fictional allocations from this 
dummy source. On the other hand, a huge unit cost of M is assigned to the Calorie 
River-Hollyglass spot. The reason is that Calorie River water cannot be used to supply 
Hollyglass and assigning a cost of M will prevent any such allocation. 

Now let us see how we can take each city’s minimum needs into account in 
this kind of formulation. Because San Go has no minimum need, it is already all set. 
Similarly, the formulation for Hollyglass does not require any adjustments because 
its demand (60) exceeds the dummy source's supply (50) by 10, so the amount 
supplied to Hollyglass from the real sources will be at least 10 in any feasible solution. 
Consequently, its minimum need of 10 from the rivers is guaranteed. (If this coinci- 
dence had not occurred, Hollyglass would need the same adjustments that we shall 
have to make for Berdoo.) 

Table 7.11 Cost and Requirements Table Without Minimum Needs for Metro Water District

                             Cost Per Unit Distributed
                                    Destination
                      Berdoo   Los Devils   San Go   Hollyglass   Supply
         Colombo R.     16         13         22         17         50
Source   Sacron R.      14         13         19         15         60
         Calorie R.     19         20         23          M         50
         Dummy           0          0          0          0         50
Demand                  50         70         30         60


Los Devils’ minimum need equals its requested allocation, so its entire demand
of 70 must be filled from the real sources rather than the dummy source. This
requirement calls for the Big M method! Assigning a huge unit cost of M to the allocation
from the dummy source to Los Devils ensures that this allocation will be zero in an
optimal solution.


Table 7.12 Cost and Requirements Table for Metro Water District

                               Cost Per Unit Distributed
                                      Destination
                        B.(min.)  B.(extra)   L.D.    S.G.     H.
                           1          2         3       4       5     Supply
         Col. R.   1      16         16        13      22      17       50
Source   Sac. R.   2      14         14        13      19      15       60
         Cal. R.   3      19         19        20      23       M       50
         Dummy   4(D)      M          0         M       0       0       50
Demand                    30         20        70      30      60

Finally, consider Berdoo. In contrast to the case of Hollyglass, the dummy 
source has an adequate (fictional) supply to "provide" at least some of Berdoo’s
minimum need in addition to its extra requested amount. Therefore, since Berdoo’s 
minimum need is 30, adjustments must be made to prevent the dummy source from 
contributing more than 20 to Berdoo’s total demand of 50. This adjustment is accom- 
plished by splitting Berdoo into two destinations, one having a demand of 30 with a 
unit cost of M for any allocation from the dummy source and the other having a 
demand of 20 with a unit cost of zero for the dummy source allocation. This for- 
mulation gives the final cost and requirements table shown in Table 7.12. 
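
The reasoning that leads from Table 7.10 to Table 7.12 can be retraced programmatically. The following is a minimal sketch, not a general-purpose tool: the function name `metro_water_table` is hypothetical, M is an arbitrary large number, and the three-way rule inside the loop simply encodes the three cases discussed above (minimum met automatically, minimum equal to the request, and a split destination).

```python
M = 10**6  # assumed numeric stand-in for the Big M penalty

cities = ["Berdoo", "Los Devils", "San Go", "Hollyglass"]
cost = [                     # rows: Colombo, Sacron, Calorie Rivers
    [16, 13, 22, 17],
    [14, 13, 19, 15],
    [19, 20, 23, M],         # Calorie River cannot supply Hollyglass
]
supply = [50, 60, 50]
min_need = [30, 70, 0, 10]
requested = [50, 70, 30, None]   # None: will take as much as it can get


def metro_water_table():
    """Return (labels, cost rows, supplies, demands) as in Table 7.12."""
    total = sum(supply)
    # Upper bound per city: its request, capped by the supply left over
    # after the other cities' minimum needs are met.
    upper = []
    for k, req in enumerate(requested):
        cap = total - (sum(min_need) - min_need[k])
        upper.append(cap if req is None else min(req, cap))
    dummy_supply = sum(upper) - total

    labels, columns, dummy_row, demand = [], [], [], []

    def add(label, k, dummy_cost, amount):
        labels.append(label)
        columns.append([row[k] for row in cost])
        dummy_row.append(dummy_cost)
        demand.append(amount)

    for k, city in enumerate(cities):
        lo, hi = min_need[k], upper[k]
        if max(0, hi - dummy_supply) >= lo:
            add(city, k, 0, hi)                  # minimum is met automatically
        elif lo == hi:
            add(city, k, M, hi)                  # nothing may come from the dummy
        else:
            add(city + " (min.)", k, M, lo)      # must come from real sources
            add(city + " (extra)", k, 0, hi - lo)

    rows = [list(r) for r in zip(*columns)] + [dummy_row]
    return labels, rows, supply + [dummy_supply], demand
```

Under these assumptions the function reproduces the five destination columns, the dummy supply of 50, and the placement of M in the dummy row shown in Table 7.12.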

This problem will be solved in the next section to illustrate the solution procedure 
presented there. 


7.2 A Streamlined Simplex Method for the 
Transportation Problem 


Because the transportation problem is just a special type of linear programming prob- 
lem, it can be solved by applying the simplex method as described in Chap. 4. 
However, you will see in this section that some tremendous computational shortcuts 
can be obtained in this method by exploiting the special structure shown in Table 7.6. 
We shall refer to this streamlined procedure as the transportation simplex method. 

As you read on, note particularly how the special structure is exploited to achieve 
great computational savings. Then bear in mind that comparable savings sometimes 
can be achieved by exploiting other types of special structures as well, including those 
described later in the chapter. 


Setting Up the Transportation Simplex Method 


To highlight the streamlining achieved by the transportation simplex method, let us
first review how the general (unstreamlined) simplex method would set up the
transportation problem in tabular form. After constructing the table of constraint
coefficients (see Table 7.6), converting the objective function to maximization form, and
using the Big M method to introduce artificial variables $z_1, z_2, \ldots, z_{m+n}$ into the
(m + n) respective equality constraints (see Sec. 4.6), typical columns of the simplex
tableau would have the form shown in Table 7.13, where all entries not shown in
these columns are zeroes. [The one remaining adjustment before the first iteration of
the simplex method is algebraically to eliminate the nonzero coefficients of the initial
(artificial) basic variables in row 0.]


Table 7.13 Original Simplex Tableau before Applying Simplex Method to Transportation Problem

Basic       Eq.                      Coefficient of                     Right
Variable    No.        Z    ...   x_ij   ...   z_i   ...   z_m+j   ...  Side
Z           (0)       -1          c_ij          M            M           0
  ...
z_i         (i)        0           1            1                       s_i
  ...
z_m+j       (m + j)    0           1                         1          d_j
  ...

After any subsequent iteration, row 0 then would have the form shown in Table 
7.14. Because of the pattern of zeroes and ones for the coefficients in Table 7.13, the 
fundamental insight presented in Sec. 5.3 implies that the $u_i$ and $v_j$ would have the
following interpretation:


$u_i$ = multiple of original row i that has been subtracted (directly or indirectly) from
original row 0 by simplex method during all iterations leading to current simplex
tableau.


$v_j$ = multiple of original row (m + j) that has been subtracted (directly or indirectly)
from original row 0 by simplex method during all iterations leading to current
simplex tableau.


You might recognize the $u_i$ and $v_j$ from Chap. 6 as being the dual variables.¹ If $x_{ij}$ is
a nonbasic variable, $(c_{ij} - u_i - v_j)$ is interpreted as the rate at which Z would change
as $x_{ij}$ is increased.

Table 7.14 Row 0 of Simplex Tableau When Applying Simplex Method to Transportation Problem

Basic      Eq.                         Coefficient of                              Right
Variable   No.     Z   ...   x_ij               ...   z_i       ...   z_m+j   ...  Side
Z          (0)    -1         c_ij - u_i - v_j         M - u_i         M - v_j      -(sum of s_i u_i) - (sum of d_j v_j)


¹ It would be easier to recognize these variables as dual variables by relabeling all these variables as $y_i$ and
then changing all the signs in row 0 of Table 7.14 by converting the objective function back to its original
minimization form.


To lay the groundwork for simplifying this setup, recall what information is
needed by the simplex method. In the initialization step, an initial basic feasible
solution must be obtained, which is done artificially by introducing artificial variables 
as the initial basic variables and setting them equal to the $s_i$ and $d_j$. The optimality
test and part 1 of the iterative step (selecting an entering basic variable) require 
knowing the current row 0, which is obtained by subtracting a certain multiple of 
another row from the preceding row 0. Part 2 (determining the leaving basic variable) 
must identify the basic variable that reaches zero first as the entering basic variable 
is increased, which is done by comparing the current coefficients of the entering basic 
variable and the corresponding right side. Part 3 must determine the new basic feasible 
solution, which is found by subtracting certain multiples of one row from the other 
rows in the current simplex tableau. 

Now, how does the transportation simplex method obtain the same information 
in much simpler ways? This story will unfold fully in the coming pages, but here are 
some preliminary answers. 

First, no artificial variables are needed, because a simple and convenient pro- 
cedure (with several variations) is available for constructing an initial basic feasible 
solution. 

Second, the current row 0 can be obtained without using any other row simply 
by calculating the current values of the $u_i$ and $v_j$ directly. Since each basic variable
must have a coefficient of zero in row 0, the current $u_i$ and $v_j$ are obtained by solving
the set of equations


$$c_{ij} - u_i - v_j = 0 \quad \text{for each } i \text{ and } j \text{ such that } x_{ij} \text{ is a basic variable,}$$


which can be done in a straightforward way. (Note how the special structure in Table
7.13 makes this convenient way of obtaining row 0 possible by yielding $c_{ij} - u_i - v_j$
as the coefficient of $x_{ij}$ in Table 7.14.)
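
To see why this system is easy to solve, note that each equation links exactly one $u_i$ with one $v_j$, so the values can be found one at a time once any single value is fixed. The sketch below illustrates this under explicit assumptions: the function names are hypothetical, and setting $u_1 = 0$ is an arbitrary choice made here because the system has one more unknown than it has equations.

```python
def dual_values(costs, basic_cells):
    """Solve c_ij - u_i - v_j = 0 over the basic cells (0-based indices)."""
    m, n = len(costs), len(costs[0])
    u, v = [None] * m, [None] * n
    u[0] = 0  # assumption: one value is fixed arbitrarily
    pending = list(basic_cells)
    while pending:
        progressed = False
        for (i, j) in pending[:]:
            if u[i] is not None and v[j] is None:
                v[j] = costs[i][j] - u[i]
            elif v[j] is not None and u[i] is None:
                u[i] = costs[i][j] - v[j]
            elif u[i] is None:
                continue  # neither value known yet; revisit on a later pass
            pending.remove((i, j))
            progressed = True
        if not progressed:
            raise ValueError("basic cells do not determine all u_i and v_j")
    return u, v


def reduced_costs(costs, u, v):
    """Row 0 coefficients c_ij - u_i - v_j for every cell."""
    return [[costs[i][j] - u[i] - v[j] for j in range(len(v))]
            for i in range(len(u))]
```

As a usage example with made-up data, take costs `[[4, 6, 8], [5, 3, 7]]` and basic cells `(0, 0), (0, 1), (1, 1), (1, 2)`. The basic cells then get a coefficient of zero, and the two nonbasic cells get the rates at which Z would change if they were increased.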

Third, the leaving basic variable can be identified in a simple way without 
(explicitly) using the coefficients of the entering basic variable. The reason is that the 
special structure of the problem makes it easy to see how the solution must change 
as the entering basic variable is increased. As a result, the new basic feasible solution 
also can be identified immediately without any algebraic manipulations on the rows 
of the simplex tableau. 

The grand conclusion is that almost the entire simplex tableau (and the work of
maintaining it) can be eliminated! Besides the input data (the $c_{ij}$, $s_i$, and $d_j$ values),
the only information needed by the transportation simplex method is the current basic
feasible solution,¹ the current values of the $u_i$ and $v_j$, and the resulting values of
$(c_{ij} - u_i - v_j)$ for nonbasic variables $x_{ij}$. When you solve a problem by hand, it is
convenient to record this information for each iteration in a transportation simplex
tableau, such as shown in Table 7.15. [Note carefully that the values of $x_{ij}$ and
$(c_{ij} - u_i - v_j)$ are distinguished in these tableaux by circling the former but not the
latter.]

Table 7.15 Format of Transportation Simplex Tableau

                            Destination
                 1        2       ...       n       Supply    u_i
           1    c_11     c_12     ...      c_1n      s_1
Source     2    c_21     c_22     ...      c_2n      s_2
          ...   ...      ...      ...      ...       ...
           m    c_m1     c_m2     ...      c_mn      s_m
Demand          d_1      d_2      ...      d_n       Z =
v_j

Additional information to be added in each cell:
  If x_ij is a basic variable: the value of x_ij, circled.
  If x_ij is a nonbasic variable: the value of c_ij - u_i - v_j.


¹ Since nonbasic variables are automatically zero, the current basic feasible solution is fully identified by
recording just the values of the basic variables. We shall use this convention from now on.


You can gain a fuller appreciation for the great difference in efficiency and
convenience between the simplex and the transportation simplex methods by applying
them both to the same small problem (see Prob. 22). However, the difference becomes
even more pronounced for large problems that must be solved on a computer. This
pronounced difference is suggested somewhat by comparing the sizes of the simplex
and the transportation simplex tableaux. Thus, for a transportation problem having m
sources and n destinations, the simplex tableau would have (m + n + 1) rows and
(m + 1)(n + 1) columns (excluding those to the left of the $x_{ij}$ columns), and the
transportation simplex tableau would have m rows and n columns (excluding the two
extra informational rows and columns). Now try plugging in various values for m and
n (for example, m = 10 and n = 100 would be a rather typical middle-sized
transportation problem), and note how the ratio of the number of cells in the simplex
tableau to the number in the transportation simplex tableau increases as m and n
increase.
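
The suggested experiment is easy to carry out. The helper names below are hypothetical; the formulas are the row and column counts quoted in the comparison of tableau sizes.

```python
def simplex_cells(m, n):
    """Cells in the general simplex tableau: (m + n + 1) rows, (m + 1)(n + 1) columns."""
    return (m + n + 1) * (m + 1) * (n + 1)


def transportation_cells(m, n):
    """Cells in the transportation simplex tableau: m rows, n columns."""
    return m * n


def size_ratio(m, n):
    """How many times larger the general simplex tableau is."""
    return simplex_cells(m, n) / transportation_cells(m, n)


if __name__ == "__main__":
    for m, n in [(3, 4), (10, 100), (100, 1000)]:
        print(m, n, simplex_cells(m, n), transportation_cells(m, n), size_ratio(m, n))
```

For m = 10 and n = 100, the general tableau has 123,321 cells against 1,000 for the transportation simplex tableau.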


Initialization Step 


Recall that the objective of the initialization step is to obtain an initial basic feasible 
solution. Because all the functional constraints in the transportation problem are equal- 
ity constraints, the simplex method would obtain this solution by introducing artificial 
variables and using them as the initial basic variables, as described in Sec. 4.6. The 
resulting basic solution actually is feasible only for a revised version of the problem, 
so a number of iterations are needed to drive these artificial variables to zero in order 
to reach the real basic feasible solutions. The transportation simplex method bypasses 
all this by instead using a simpler procedure to directly construct a real basic feasible 
solution on a transportation simplex tableau. 

Before outlining this procedure, we need to point out that the number of basic 
variables in any basic solution of a transportation problem is one fewer than you might 
expect. Ordinarily, there is one basic variable for each functional constraint in a linear 
programming problem. For transportation problems with m sources and n destinations,
the number of functional constraints is m + n. However,

    number of basic variables = m + n - 1.

The reason is that the functional constraints are equality constraints, and this
set of (m + n) equations has one extra (or redundant) equation that can be deleted
without changing the feasible region; i.e., any one of the constraints is automatically
satisfied whenever the other m + n - 1 constraints are satisfied. (This fact can be
verified by showing that any supply constraint exactly equals the sum of the demand
constraints minus the sum of the other supply constraints, and that any demand
equation also can be reproduced by summing the supply equations and subtracting the
other demand equations. See Prob. 23.) Therefore, any basic feasible solution appears
on a transportation simplex tableau with exactly (m + n - 1) circled nonnegative
allocations, where the sum of the allocations for each row or column equals its supply
or demand.¹
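
The redundancy argument can be checked numerically for any small m and n. This is only an illustrative sketch with hypothetical function names: it builds the left-hand sides of the supply and demand constraints (columns ordered $x_{11}, \ldots, x_{1n}, x_{21}, \ldots, x_{mn}$) and confirms that the supply rows and the demand rows add up to the same row, so that any one equation can be recovered from the others when total supply equals total demand.

```python
def constraint_rows(m, n):
    """Coefficient rows of the m supply and n demand constraints."""
    width = m * n
    # Variable x_ij sits in column k = i * n + j (0-based i and j).
    supply_rows = [[1 if k // n == i else 0 for k in range(width)] for i in range(m)]
    demand_rows = [[1 if k % n == j else 0 for k in range(width)] for j in range(n)]
    return supply_rows, demand_rows


def column_sums(rows):
    """Add a list of coefficient rows together, column by column."""
    return [sum(col) for col in zip(*rows)]


def has_redundant_equation(m, n):
    """True when the supply rows and the demand rows sum to the same row."""
    supply_rows, demand_rows = constraint_rows(m, n)
    return column_sums(supply_rows) == column_sums(demand_rows)
```

Each variable appears in exactly one supply constraint and exactly one demand constraint, which is why both sums consist entirely of ones.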

The procedure for constructing an initial basic feasible solution selects the
(m + n - 1) basic variables one at a time. After each selection, a value that will
satisfy one additional constraint (thereby eliminating that constraint’s row or column
from further consideration for providing allocations) is assigned to that variable. Thus,
after (m + n - 1) selections, an entire basic solution has been constructed in such
a way as to satisfy all the constraints. A number of different criteria have been
proposed for selecting the basic variables. We present and illustrate three of these
criteria here after outlining the general procedure.


General Procedure² for Constructing an Initial Basic Feasible Solution


To Begin: All source rows and destination columns of the transportation simplex 
tableau are initially under consideration for providing a basic variable (allocation). 


Step 1: From among the rows and columns still under consideration, select the next 
basic variable (allocation) according to some criterion. 


Step 2: Make that allocation large enough to exactly use up the remaining supply in 
its row or the remaining demand in its column (whichever is smaller). 


Step 3: Eliminate that row or column (whichever had the smaller remaining supply 
or demand) from further consideration. (If the row and column have the same re- 
maining supply and demand, then arbitrarily select the row as the one to be eliminated. 
The column will be used later to provide a degenerate basic variable, i.e., a circled 
allocation of zero.) 


Step 4: If only one row or only one column remains under consideration, then the
procedure is completed by selecting every remaining variable (i.e., those variables
that were neither previously selected to be basic nor eliminated from consideration by
eliminating their row or column) associated with that row or column to be basic with
the only feasible allocation. Otherwise, return to step 1.


¹ However, note that any feasible solution with (m + n - 1) nonzero variables is not necessarily a basic
solution because it might be the weighted average of two or more degenerate basic feasible solutions (i.e.,
basic feasible solutions having some basic variables equal to zero). We need not be concerned about
mislabeling such solutions as being basic, however, because the transportation simplex method constructs
only legitimate basic feasible solutions.


² In Sec. 4.1 we pointed out that the simplex method is an example of the algorithms (iterative solution
procedures) so prevalent in operations research work. Note that this procedure also is an algorithm, where
each successive execution of the (four) steps constitutes an iteration.


Alternative Criteria for Step 1 


1. Northwest corner rule: Begin by selecting $x_{11}$ (i.e., start in the northwest
corner of the transportation simplex tableau). Thereafter, if $x_{ij}$ was the last basic
variable selected, then next select $x_{i,j+1}$ (i.e., move one column to the right) if source
i has any supply remaining. Otherwise, next select $x_{i+1,j}$ (i.e., move one row down).


EXAMPLE: To make this description more concrete, we now illustrate the general
procedure on the Metro Water District problem (see Table 7.12) with the northwest
corner rule being used in step 1. Because m = 4 and n = 5 in this case, the procedure
would find an initial basic feasible solution having m + n - 1 = 8 basic variables.

As shown in Table 7.16, the first allocation is $x_{11} = 30$, which exactly uses up
the demand in column 1 (and eliminates this column from further consideration). This
first iteration leaves a supply of 20 remaining in row 1, so next select $x_{1,1+1} = x_{12}$
to be a basic variable. Because this supply is no larger than the demand of 20 in
column 2, all of it is allocated, $x_{12} = 20$, and this row is eliminated from further
consideration. (Row 1 is chosen for elimination rather than column 2 because of the
parenthetical instruction in step 3.) Therefore, select $x_{1+1,2} = x_{22}$ next. Because the
remaining demand of 0 in column 2 is less than the supply of 60 in row 2, allocate
$x_{22} = 0$ and eliminate column 2.

Continuing in this manner, we eventually obtain the entire initial basic feasible
solution shown in Table 7.16, where the circled numbers are the values of the basic
variables ($x_{11} = 30, \ldots, x_{45} = 50$) and all the other variables ($x_{13}$, etc.) are nonbasic
variables equal to zero. Arrows have been added to show the order in which the basic
variables (allocations) were selected. The value of Z for this solution is


$$Z = 16(30) + 16(20) + \cdots + 0(50) = 2470 + 10M.$$
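
The general procedure and the northwest corner rule can be traced in code. What follows is a minimal sketch rather than a definitive implementation: the function names are hypothetical, indices are 0-based (so cell `(0, 0)` is $x_{11}$), M is replaced by an arbitrary large number, and the step 4 test is made at the top of each pass, which gives the same result for any problem with at least two rows and two columns.

```python
M = 10**6  # assumed numeric stand-in for the Big M penalty

# Table 7.12 data
COSTS = [
    [16, 16, 13, 22, 17],
    [14, 14, 13, 19, 15],
    [19, 19, 20, 23, M],
    [M, 0, M, 0, 0],
]
SUPPLY = [50, 60, 50, 50]
DEMAND = [30, 20, 70, 30, 60]


def northwest_corner(rows, cols, last):
    """Step 1 criterion: start at x_11, then move right or down."""
    if last is None:
        return min(rows), min(cols)
    i, j = last
    if i in rows:          # source i still has supply: one column to the right
        return i, j + 1
    return i + 1, j        # otherwise: one row down


def initial_bfs(supply, demand, choose):
    """General procedure; returns [((i, j), allocation), ...] in selection order."""
    supply, demand = list(supply), list(demand)
    rows, cols = set(range(len(supply))), set(range(len(demand)))
    basis, last = [], None
    while True:
        # Step 4: one row or one column left, so every remaining cell is basic.
        if len(rows) == 1 or len(cols) == 1:
            for i in sorted(rows):
                for j in sorted(cols):
                    basis.append(((i, j), min(supply[i], demand[j])))
            return basis
        i, j = choose(rows, cols, last)            # Step 1
        amount = min(supply[i], demand[j])         # Step 2
        basis.append(((i, j), amount))
        supply[i] -= amount
        demand[j] -= amount
        # Step 3: drop the exhausted line; on a tie drop only the row.
        if supply[i] <= demand[j]:
            rows.discard(i)
        else:
            cols.discard(j)
        last = (i, j)


def total_cost(costs, basis):
    return sum(costs[i][j] * x for (i, j), x in basis)
```

Running `initial_bfs(SUPPLY, DEMAND, northwest_corner)` yields eight basic variables, including the degenerate allocation $x_{22} = 0$, and `total_cost` then evaluates to 2470 + 10M for whatever value of M is used.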


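Table 7.16 Initial Basic Feasible Solution from Northwest Corner Rule

                                   Destination
                   1          2          3          4          5        Supply    u_i
           1     16 (30)    16 (20)    13         22         17           50
Source     2     14         14 (0)     13 (60)    19         15           60
           3     19         19         20 (10)    23 (30)     M (10)      50
           4(D)   M          0          M          0          0 (50)      50
Demand           30         20         70         30         60        Z = 2470 + 10M
v_j

Each cell shows the unit cost c_ij, followed in parentheses by the allocation x_ij when
x_ij is a basic variable. The basic variables were selected in the order x_11, x_12, x_22,
x_23, x_33, x_34, x_35, x_45.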





2. Vogel’s approximation method: For each row and column remaining under 
consideration, calculate its difference, which is defined as the arithmetic difference 
between the smallest and next-to-the-smallest unit cost ($c_{ij}$) still remaining in that row
or column. In that row or column having the largest difference, select the variable 
having the smallest remaining unit cost. (Ties for the largest difference may be broken 
arbitrarily.) 


EXAMPLE: Now let us apply the general procedure to the Metro Water District 
problem by using the criterion for Vogel’s approximation method to select the next 
basic variable in step 1. With this criterion, it is more convenient to work with cost 
and requirements tables (rather than with complete transportation simplex tableaux), 
beginning with the one shown in Table 7.12. At each iteration, after calculating and 
displaying the difference for every row and column remaining under consideration, 
the largest difference is circled and the smallest unit cost in its row or column is 
enclosed in a box. The resulting selection (and value) of the variable having this unit 
cost as the next basic variable is indicated in the lower right-hand corner of the current 
table, along with the row or column thereby being eliminated from further consider- 
ation (see steps 2 and 3 of the general procedure). The table for the next iteration is 
exactly the same except for deleting this row or column and subtracting the last 
allocation from its supply or demand (whichever remains). 

Applying this procedure to the Metro Water District problem yields the sequence 
of cost and requirements tables shown in Table 7.17, where the resulting initial basic 
feasible solution consists of the eight basic variables (allocations) given in the lower 
right-hand corner of the respective cost and requirements tables. 

This example illustrates two relatively subtle features of the general procedure 
that warrant special attention. First, note that the final iteration selects three variables 
($x_{31}$, $x_{32}$, and $x_{33}$) to become basic instead of the single selection made at the other
iterations. The reason is that only one row (row 3) remains under consideration at this 
point. Therefore, step 4 of the general procedure says to select every remaining vari- 
able associated with row 3 to be basic. 

Second, note that the allocation of $x_{23} = 20$ at the next-to-last iteration exhausts
both the remaining supply in its row and the remaining demand in its column. How-
ever, rather than eliminate both the row and column from further consideration, step
3 says to eliminate only the row, saving the column to provide a degenerate basic
variable later. Column 3 is, in fact, used for just this purpose at the final iteration
when $x_{33} = 0$ is selected as one of the basic variables. For another illustration of this
same phenomenon, see Table 7.16, where the allocation of $x_{12} = 20$ results in elim-
inating only row 1, so that column 2 is saved to provide a degenerate basic variable,
$x_{22} = 0$, at the next iteration.

Although a zero allocation might seem irrelevant, it actually plays an important 
role. You will see soon that the transportation simplex method must know all (m + 
n — 1) basic variables, including those with value zero, in the current basic feasible 
solution. 
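
Vogel's criterion can also be sketched in Python. The code below is an illustrative sketch under stated assumptions rather than the book's software: the name `vogel`, the zero-based indices, and the stand-in `M = 1000` are choices made here, and ties are broken by taking the first candidate found. When supply and demand tie, only the row is eliminated, and once a single row or column remains every remaining variable in it is made basic.

```python
M = 1000  # assumed numeric stand-in for the "huge" cost M

COSTS = [
    [16, 16, 13, 22, 17],
    [14, 14, 13, 19, 15],
    [19, 19, 20, 23, M],
    [M, 0, M, 0, 0],      # dummy source 4(D)
]
SUPPLY = [50, 60, 50, 50]
DEMAND = [30, 20, 70, 30, 60]


def difference(values):
    """Smallest minus next-to-smallest remaining unit cost."""
    ordered = sorted(values)
    return ordered[1] - ordered[0]


def vogel(costs, supply, demand):
    s, d = list(supply), list(demand)
    rows, cols = set(range(len(s))), set(range(len(d)))
    basis = []
    while len(rows) > 1 and len(cols) > 1:
        candidates = [(difference([costs[i][j] for j in cols]), "row", i)
                      for i in sorted(rows)]
        candidates += [(difference([costs[i][j] for i in rows]), "col", j)
                       for j in sorted(cols)]
        _, kind, k = max(candidates, key=lambda t: t[0])
        if kind == "row":
            i, j = k, min(sorted(cols), key=lambda c: costs[k][c])
        else:
            i, j = min(sorted(rows), key=lambda r: costs[r][k]), k
        x = min(s[i], d[j])
        basis.append((i, j, x))
        s[i] -= x
        d[j] -= x
        if s[i] == 0:
            rows.remove(i)    # supply exhausted (or tie): eliminate the row only
        else:
            cols.remove(j)    # demand exhausted: eliminate the column
    # Only one row or column remains: every remaining variable becomes basic.
    for i in sorted(rows):
        for j in sorted(cols):
            basis.append((i, j, min(s[i], d[j])))
    return basis


basis = vogel(COSTS, SUPPLY, DEMAND)
z = sum(COSTS[i][j] * x for i, j, x in basis)
print(basis)
print(z)
```

For the Metro Water District data this sketch selects $x_{44}$, $x_{45}$, $x_{13}$, $x_{25}$, $x_{23}$ and then $x_{31}$, $x_{32}$, $x_{33}$, including the zero allocation $x_{33} = 0$.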


Table 7.17 Initial Basic Feasible Solution from Vogel’s Approximation Method

(The largest difference at each iteration is shown in brackets.)

Iteration 1
                          Destination                      Row
Source              1     2     3     4     5   Supply  difference
1                  16    16    13    22    17     50        3
2                  14    14    13    19    15     60        1
3                  19    19    20    23     M     50        0
4(D)                M     0     M     0     0     50        0
Demand             30    20    70    30    60   Select $x_{44} = 30$
Column difference   2    14     0  [19]    15   Eliminate column 4

Iteration 2
                       Destination                 Row
Source              1     2     3     5   Supply  difference
1                  16    16    13    17     50        3
2                  14    14    13    15     60        1
3                  19    19    20     M     50        0
4(D)                M     0     M     0     20        0
Demand             30    20    70    60   Select $x_{45} = 20$
Column difference   2    14     0  [15]   Eliminate row 4(D)

Iteration 3
                       Destination                 Row
Source              1     2     3     5   Supply  difference
1                  16    16    13    17     50       [3]
2                  14    14    13    15     60        1
3                  19    19    20     M     50        0
Demand             30    20    70    40   Select $x_{13} = 50$
Column difference   2     2     0     2   Eliminate row 1

Iteration 4
                       Destination                 Row
Source              1     2     3     5   Supply  difference
2                  14    14    13    15     60        1
3                  19    19    20     M     50        0
Demand             30    20    20    40   Select $x_{25} = 40$
Column difference   5     5     7 [M - 15]  Eliminate column 5

Iteration 5
                    Destination              Row
Source              1     2     3   Supply  difference
2                  14    14    13     20        1
3                  19    19    20     50        0
Demand             30    20    20   Select $x_{23} = 20$
Column difference   5     5   [7]   Eliminate row 2

Iteration 6
                    Destination
Source              1     2     3   Supply
3                  19    19    20     50
Demand             30    20     0   Select $x_{31} = 30$, $x_{32} = 20$, $x_{33} = 0$
                                    Z = 2,460


3. Russell’s approximation method: For each source row i remaining under
consideration, determine its $\bar{u}_i$, which is the largest unit cost ($c_{ij}$) still remaining
in that row. For each destination column j remaining under consideration, determine
its $\bar{v}_j$, which is the largest unit cost ($c_{ij}$) still remaining in that column. For each
variable $x_{ij}$ not previously selected in these rows and columns, calculate $\Delta_{ij} = c_{ij} -$








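Table 7.18 Initial Basic Feasible Solution from Russell’s Approximation Method

(Columns u1 to u4 and v1 to v5 give the current values of $\bar{u}_i$ and $\bar{v}_j$; a blank
entry means that the row or column has been eliminated.)

Iteration   u1   u2   u3   u4    v1   v2   v3   v4   v5   Largest negative $\Delta_{ij}$    Allocation
1           22   19    M    M     M   19    M   23    M   $\Delta_{45} = -2M$          $x_{45} = 50$
2           22   19    M         19   19   20   23    M   $\Delta_{15} = -5 - M$       $x_{15} = 10$
3           22   19   23         19   19   20   23        $\Delta_{13} = -29$          $x_{13} = 40$
4                19   23         19   19   20   23        $\Delta_{23} = -26$          $x_{23} = 30$
5                19   23         19   19        23        $\Delta_{21} = -24$*         $x_{21} = 30$
6                                                         Irrelevant                   $x_{31} = 0$
                                                                                       $x_{32} = 20$
                                                                                       $x_{34} = 30$
                                                                                       Z = 2,570

* Tie with $\Delta_{22} = -24$ broken arbitrarily.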


$\bar{u}_i - \bar{v}_j$. Select the variable having the largest (in absolute terms) negative value of
$\Delta_{ij}$. (Ties may be broken arbitrarily.)


EXAMPLE: Using the criterion for Russell’s approximation method in step 1, we 
again apply the general procedure to the Metro Water District problem (see Table 
7.12). The results, including the sequence of basic variables (allocations), are shown 
in Table 7.18. 

At iteration 1, the largest unit cost in row 1 is $\bar{u}_1 = 22$, the largest in column
1 is $\bar{v}_1 = M$, and so forth. Thus


$$\Delta_{11} = c_{11} - \bar{u}_1 - \bar{v}_1 = 16 - 22 - M = -6 - M.$$


Calculating all the $\Delta_{ij}$ for i = 1, 2, 3, 4 and j = 1, 2, 3, 4, 5 shows that $\Delta_{45} =
0 - 2M$ has the largest negative value, so $x_{45} = 50$ is selected as the first basic
variable (allocation). This allocation exactly uses up the supply in row 4, so this row
is eliminated from further consideration.

Note that eliminating this row changes $\bar{v}_1$ and $\bar{v}_3$ for the next iteration. Therefore,
the second iteration requires recalculating the $\Delta_{ij}$ with j = 1, 3, as well as eliminating
i = 4. The largest negative value now is


$$\Delta_{15} = 17 - 22 - M = -5 - M,$$


so $x_{15} = 10$ becomes the second basic variable (allocation), eliminating column 5
from further consideration.

The subsequent iterations proceed similarly, but you may want to test your 
understanding by verifying the remaining allocations given in Table 7.18. As with 
the other procedures in this (and other) section, you should find your OR COURSE- 
WARE useful for doing the calculations involved and illuminating the approach. 
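
For readers who prefer to check the remaining allocations by program, here is a hedged Python sketch of Russell's criterion. It is not the OR COURSEWARE routine; the function name `russell`, zero-based indexing, the stand-in `M = 1000`, and the tie-breaking rule (first cell found in row-by-row order) are all assumptions made for this illustration.

```python
M = 1000  # assumed numeric stand-in for the "huge" cost M

COSTS = [
    [16, 16, 13, 22, 17],
    [14, 14, 13, 19, 15],
    [19, 19, 20, 23, M],
    [M, 0, M, 0, 0],      # dummy source 4(D)
]
SUPPLY = [50, 60, 50, 50]
DEMAND = [30, 20, 70, 30, 60]


def russell(costs, supply, demand):
    s, d = list(supply), list(demand)
    rows, cols = set(range(len(s))), set(range(len(d)))
    basis = []
    while len(rows) > 1 and len(cols) > 1:
        # Largest remaining unit cost in each remaining row and column.
        u_bar = {i: max(costs[i][j] for j in cols) for i in rows}
        v_bar = {j: max(costs[i][j] for i in rows) for j in cols}
        cells = [(i, j) for i in sorted(rows) for j in sorted(cols)]
        # Most negative Delta_ij = c_ij - u_bar_i - v_bar_j (first one wins ties).
        i, j = min(cells, key=lambda c: costs[c[0]][c[1]] - u_bar[c[0]] - v_bar[c[1]])
        x = min(s[i], d[j])
        basis.append((i, j, x))
        s[i] -= x
        d[j] -= x
        if s[i] == 0:
            rows.remove(i)    # supply exhausted (or tie): eliminate the row only
        else:
            cols.remove(j)    # demand exhausted: eliminate the column
    # Only one row or column remains: every remaining variable becomes basic.
    for i in sorted(rows):
        for j in sorted(cols):
            basis.append((i, j, min(s[i], d[j])))
    return basis


basis = russell(COSTS, SUPPLY, DEMAND)
z = sum(COSTS[i][j] * x for i, j, x in basis)
print(basis)
print(z)
```

With this tie-breaking rule the sketch makes the same choice as Table 7.18 at iteration 5 ($x_{21}$ rather than $x_{22}$); a different rule could lead to a different initial solution.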


Comparison of Alternative Criteria for Step 1 


Now let us compare these three criteria for selecting the next basic variable.
The main virtue of the northwest corner rule is that it is quick and easy. However,
because it pays no attention to unit costs ($c_{ij}$), usually the solution obtained will be
far from optimal. (Note in Table 7.16 that $x_{35} = 10$ even though $c_{35} = M$.) Expending
a little more effort to find a good initial basic feasible solution might greatly reduce
the number of iterations then required by the transportation simplex method to reach
an optimal solution (see Probs. 7 and 12). Finding such a solution is the objective of
the other two criteria.


Vogel’s approximation method has been a popular criterion for many years,¹
partially because it is relatively easy to implement by hand. Because difference rep- 
resents the minimum extra unit cost incurred by failing to make an allocation to the 
cell having the smallest unit cost in that row or column, this criterion does take costs 
into account in an effective way. 

Russell’s approximation method provides another excellent criterion² that is still
quick to implement on a computer (but not manually). Although more experimentation 
is required to determine which is more effective on the average, this criterion fre- 
quently does obtain a better solution than Vogel’s. (For the example, Vogel’s ap- 
proximation method happened to find the optimal solution with Z = 2,460, whereas 
Russell’s misses slightly with Z = 2,570.) For a large problem, it may be worthwhile 
to apply both criteria and then use the better solution to start the iterations of the 
transportation simplex method. 

One distinct advantage of Russell’s approximation method is that it is patterned
directly after part 1 of the iterative step for the transportation simplex method (as you
will see soon), which somewhat simplifies the overall computer code. In particular,
the $\bar{u}_i$ and $\bar{v}_j$ have been defined in such a way that the relative values of the
$(c_{ij} - \bar{u}_i - \bar{v}_j)$ estimate the relative values of the $(c_{ij} - u_i - v_j)$ that will be obtained when
the transportation simplex method reaches an optimal solution.


We now shall use the initial basic feasible solution obtained in Table 7.18 by
Russell’s approximation method to illustrate the remainder of the transportation sim-
plex method. Thus, our initial transportation simplex tableau (before solving for the
$u_i$ and $v_j$) is the one shown in Table 7.19.


Table 7.19 Initial Transportation Simplex Tableau (before Obtaining the $c_{ij} - u_i - v_j$) from
Russell’s Approximation Method

(Each cell shows the unit cost; the circled allocation of a basic variable follows in
parentheses. The $u_i$ column and $v_j$ row are still to be filled in.)

Iteration 0
                                Destination
Source          1         2         3         4         5       Supply   $u_i$
1            16        16        13 (40)   22        17 (10)      50
2            14 (30)   14        13 (30)   19        15           60
3            19 (0)    19 (20)   20        23 (30)   M            50
4(D)         M         0         M         0         0  (50)      50
Demand       30        20        70        30        60       Z = 2,570
$v_j$


1 Reinfeld, N. V., and W. R. Vogel: Mathematical Programming, Prentice-Hall, Englewood Cliffs, N.J.,
1958.

2 Russell, Edward J.: “Extension of Dantzig’s Algorithm to Finding an Initial Near-Optimal Basis for the
Transportation Problem,” Operations Research, 17: 187-191, 1969.


The next step is to check whether this initial solution is optimal by applying the 
optimality test. 


Optimality Test 


Using the notation of Table 7.14, we can reduce the standard optimality test for the 
simplex method (see Sec. 4.3) to the following for the transportation problem: 


Optimality test: A basic feasible solution is optimal if and only if
$c_{ij} - u_i - v_j \ge 0$ for every (i, j) such that $x_{ij}$ is nonbasic.¹


Thus the only work required by the optimality test is the derivation of the values of
the $u_i$ and $v_j$ for the current basic feasible solution and then the calculation of these
$(c_{ij} - u_i - v_j)$.

Since $(c_{ij} - u_i - v_j)$ is required to be zero if $x_{ij}$ is a basic variable, the $u_i$ and
$v_j$ satisfy the set of equations


$$c_{ij} = u_i + v_j \quad \text{for each } (i, j) \text{ such that } x_{ij} \text{ is basic.}$$


There are (m + n - 1) basic variables, and so there are (m + n - 1) of these
equations. Since the number of unknowns (the $u_i$ and $v_j$) is (m + n), one of these
variables can be assigned a value arbitrarily without violating the equations. (The rule
we shall adopt is to select the $u_i$ that has the largest number of allocations in its row
and assign it the value of zero.) Because of the simple structure of these equations,
it is then very simple to solve for the remaining variables algebraically.

To demonstrate, we give each equation that corresponds to a basic variable in
our initial basic feasible solution.


$x_{31}$: $19 = u_3 + v_1$.   Set $u_3 = 0$, so $v_1 = 19$,
$x_{32}$: $19 = u_3 + v_2$.                       $v_2 = 19$,
$x_{34}$: $23 = u_3 + v_4$.                       $v_4 = 23$.
$x_{21}$: $14 = u_2 + v_1$.   Know $v_1 = 19$, so $u_2 = -5$.
$x_{23}$: $13 = u_2 + v_3$.   Know $u_2 = -5$, so $v_3 = 18$.
$x_{13}$: $13 = u_1 + v_3$.   Know $v_3 = 18$, so $u_1 = -5$.
$x_{15}$: $17 = u_1 + v_5$.   Know $u_1 = -5$, so $v_5 = 22$.
$x_{45}$: $0 = u_4 + v_5$.    Know $v_5 = 22$, so $u_4 = -22$.


Setting $u_3 = 0$ (since row 3 of Table 7.19 has the largest number of allocations, 3)
and moving down the equations one at a time immediately gives the derivation of
values for the unknowns shown to the right of the equations.
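
The same substitution can be sketched in Python. This is a minimal illustration under assumptions made here (the function name `solve_u_v`, zero-based indices, and a basis supplied as a set of cells); it assumes the basis is a valid set of m + n - 1 basic cells, since otherwise the loop would not terminate.

```python
M = 1000  # assumed numeric stand-in for the "huge" cost M

COSTS = [
    [16, 16, 13, 22, 17],
    [14, 14, 13, 19, 15],
    [19, 19, 20, 23, M],
    [M, 0, M, 0, 0],      # dummy source 4(D)
]
# Basic cells of the initial solution from Russell's method (zero-based).
BASIS = {(0, 2), (0, 4), (1, 0), (1, 2), (2, 0), (2, 1), (2, 3), (3, 4)}


def solve_u_v(costs, basis):
    """Solve c_ij = u_i + v_j over the basic cells by repeated substitution."""
    m, n = len(costs), len(costs[0])
    u, v = [None] * m, [None] * n
    # Rule from the text: set u_i = 0 for the row with the most allocations.
    start = max(range(m), key=lambda i: sum(1 for r, _ in basis if r == i))
    u[start] = 0
    pending = set(basis)
    while pending:
        for i, j in list(pending):
            if u[i] is not None and v[j] is None:
                v[j] = costs[i][j] - u[i]
            elif u[i] is None and v[j] is not None:
                u[i] = costs[i][j] - v[j]
            elif u[i] is None:
                continue          # neither value known yet; retry on a later pass
            pending.discard((i, j))
    return u, v


u, v = solve_u_v(COSTS, BASIS)
print(u)   # values of u_1, ..., u_4
print(v)   # values of v_1, ..., v_5
```

The order in which the equations are visited does not matter, because once $u_3 = 0$ is fixed the remaining values are determined uniquely.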

Once you get the hang of it, you probably will find it even more convenient to
solve these equations without writing them down by working directly on the trans-
portation simplex tableau. Thus, in Table 7.19 you would begin by writing in the
value $u_3 = 0$ and then picking out the circled allocations ($x_{31}$, $x_{32}$, $x_{34}$) in that row.
For each one you would set $v_j = c_{3j}$ and then look for circled allocations (except in
row 3) in these columns ($x_{21}$). Mentally calculate $u_2 = c_{21} - v_1$, pick out $x_{23}$, set
$v_3 = c_{23} - u_2$, and so on until you have filled in all the values for the $u_i$ and $v_j$.
(Try it.) Then calculate and fill in the value of $(c_{ij} - u_i - v_j)$ for each nonbasic
variable $x_{ij}$ (i.e., for each cell without a circled allocation), and you will have the
completed initial transportation simplex tableau shown in Table 7.20.


1 The one exception is that two or more equivalent degenerate basic feasible solutions (i.e., identical
solutions having different degenerate basic variables equal to zero) can be optimal with only some of these
basic solutions satisfying the optimality test. This exception is illustrated later in the example (see the
identical solutions in the last two tableaux of Table 7.23, where only the latter solution satisfies the criterion
for optimality).


Table 7.20 Completed Initial Transportation Simplex Tableau

We are now in a position to apply the optimality test by checking the value of
the $(c_{ij} - u_i - v_j)$ given in Table 7.20. Because two of these values, $(c_{25} - u_2 -
v_5) = -2$ and $(c_{44} - u_4 - v_4) = -1$, are negative, we conclude that the current
basic feasible solution is not optimal. Therefore, the transportation simplex method
must next go to the iterative step to find a better basic feasible solution.
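
As a quick check of this test, the short Python sketch below recomputes every $(c_{ij} - u_i - v_j)$ from the values of the $u_i$ and $v_j$ derived above and lists the negative ones. The variable names and the stand-in `M = 1000` are assumptions made for this illustration; cells are reported with the one-based labels used in the text.

```python
M = 1000  # assumed numeric stand-in for the "huge" cost M

COSTS = [
    [16, 16, 13, 22, 17],
    [14, 14, 13, 19, 15],
    [19, 19, 20, 23, M],
    [M, 0, M, 0, 0],      # dummy source 4(D)
]
u = [-5, -5, 0, -22]          # u_1, ..., u_4 for the initial solution
v = [19, 19, 18, 23, 22]      # v_1, ..., v_5 for the initial solution

negative = {}
for i in range(len(u)):
    for j in range(len(v)):
        value = COSTS[i][j] - u[i] - v[j]   # zero for every basic cell
        if value < 0:
            negative[(i + 1, j + 1)] = value

print(negative)
print("optimal" if not negative else "not optimal")
```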


Iterative Step 


As with the full-fledged simplex method, the iterative step for this streamlined version 
must determine an entering basic variable (part 1), a leaving basic variable (part 2), 
and then identify the resulting new basic feasible solution (part 3). 


PART 1: Since $(c_{ij} - u_i - v_j)$ represents the rate at which the objective function
would change as the nonbasic variable $x_{ij}$ is increased, the entering basic variable
must have a negative $(c_{ij} - u_i - v_j)$ to decrease the total cost Z. Thus the candidates
in Table 7.20 are $x_{25}$ and $x_{44}$. To choose between the candidates, select the one having
the largest (in absolute terms) negative value of $(c_{ij} - u_i - v_j)$ to be the entering
basic variable, which is $x_{25}$ in this case.


PART 2: Increasing the entering basic variable from zero sets off a chain reaction 
of compensating changes in other basic variables (allocations) in order to continue 
satisfying the supply and demand constraints. The first basic variable to be decreased 
to zero then becomes the leaving basic variable. 

With $x_{25}$ as the entering basic variable, the chain reaction in Table 7.20 is the
relatively simple one summarized in Table 7.21. (We shall always indicate the entering
basic variable by placing a boxed + sign in its cell.) Thus increasing $x_{25}$ requires
decreasing $x_{15}$ by the same amount to restore the demand of 60 in column 5, which
in turn requires increasing $x_{13}$ by this amount to restore the supply of 50 in row 1,
which in turn requires decreasing $x_{23}$ by this amount to restore the demand of 70 in
column 3. This decrease in $x_{23}$ successfully completes the chain reaction because it
also restores the supply of 60 in row 2. (Equivalently, we could have started the chain
reaction by restoring this supply in row 2 with the decrease in $x_{23}$ and then increase
$x_{13}$ and decrease $x_{15}$.)


Table 7.21 Part of Initial Transportation Simplex Tableau
Showing the Chain Reaction Caused by Increasing the Entering
Basic Variable $x_{25}$

The net result is that cells (2, 5) and (1, 3) become recipient cells, each re- 
ceiving its additional allocation from one of the donor cells, (1, 5) and (2, 3). (These 
cells are indicated in Table 7.21 by the + and — signs.) Note that cell (1, 5) had to 
be the donor cell for column 5 rather than cell (4, 5), because cell (4, 5) would have 
no recipient cell in row 4 to continue the chain reaction. [Similarly, if the chain 
reaction had been started in row 2 instead, cell (2, 1) could not be the donor cell for 
this row because the chain reaction could not then be completed successfully after 
necessarily choosing cell (3, 1) as the next recipient cell and either cell (3, 2) or (3, 4) 
as its donor cell.] 

Each donor cell decreases its allocation by exactly the same amount that the
entering basic variable (and other recipient cells) is increased. Therefore, the donor
cell that starts with the smallest allocation, cell (1, 5) in this case (since 10 < 30 in
Table 7.21), must reach a zero allocation first as the entering basic variable $x_{25}$ is
increased. Thus $x_{15}$ becomes the leaving basic variable.

In general, there always is just one chain reaction (in either direction) that can
be completed successfully to maintain feasibility when the entering basic variable is
increased from zero. This chain reaction can be identified by selecting among the cells
having a basic variable: first the donor cell in the column having the entering basic
variable, then the recipient cell in the row having this donor cell, then the donor cell
in the column having this recipient cell, and so on until the chain reaction yields a
donor cell in the row having the entering basic variable. When a column or row has
more than one additional basic variable cell, it may be necessary to trace them all
further to see which one must be selected to be the donor or recipient cell. (All but
this one eventually will reach a dead end in a row or column having no additional
basic variable cell.) After identifying the chain reaction, the donor cell having the
smallest allocation automatically provides the leaving basic variable. (In the case of
a tie for the donor cell having the smallest allocation, any one can be chosen arbitrarily
to provide the leaving basic variable.)


Table 7.22 Part of Second Transportation Simplex Tableau
Showing the Changes in the Basic Feasible Solution
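
One way to automate the search for the chain reaction is sketched below in Python. This is an assumption-laden illustration, not the procedure of the OR COURSEWARE: instead of tracing branches to dead ends, the hypothetical function `chain_reaction` repeatedly discards any cell that is alone in its row or column (such a cell is a dead end), which leaves exactly the cells of the chain reaction. Cells carry the one-based labels used in the text.

```python
def chain_reaction(basic_cells, entering):
    """Return (recipient cells, donor cells) for the given entering cell."""
    cells = set(basic_cells) | {entering}
    pruned = True
    while pruned:                       # discard dead ends until none remain
        pruned = False
        for i, j in list(cells):
            in_row = sum(1 for r, _ in cells if r == i)
            in_col = sum(1 for _, c in cells if c == j)
            if in_row == 1 or in_col == 1:
                cells.remove((i, j))
                pruned = True
    # Walk the loop: donor in the entering column, recipient in its row, ...
    path, current, move_in_column = [entering], entering, True
    while True:
        i, j = current
        if move_in_column:
            nxt = next(c for c in cells if c[1] == j and c != current)
        else:
            nxt = next(c for c in cells if c[0] == i and c != current)
        if nxt == entering:
            break
        path.append(nxt)
        current, move_in_column = nxt, not move_in_column
    return path[0::2], path[1::2]


# Initial basic feasible solution (Table 7.19) with entering variable x_25.
ALLOCATION = {(1, 3): 40, (1, 5): 10, (2, 1): 30, (2, 3): 30,
              (3, 1): 0, (3, 2): 20, (3, 4): 30, (4, 5): 50}
recipients, donors = chain_reaction(ALLOCATION, (2, 5))
leaving = min(donors, key=lambda cell: ALLOCATION[cell])
print(recipients, donors, leaving)
```

The pruning idea relies on the fact, stated above, that there is just one chain reaction; every basic cell outside it eventually becomes the only remaining cell in its row or column.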


PART 3: The new basic feasible solution is identified simply by adding the value
of the leaving basic variable (before any change) to the allocation for each recipient
cell and subtracting this same amount from the allocation for each donor cell. In Table
7.21 the value of the leaving basic variable $x_{15}$ is 10, so this portion of the transpor-
tation simplex tableau changes as shown in Table 7.22 for the new solution. (Since
$x_{15}$ is nonbasic in the new solution, its new allocation of zero is no longer shown in
this new tableau.)

We can now highlight a useful interpretation of the $(c_{ij} - u_i - v_j)$ quantities
derived during the optimality test. Because of the shift of 10 allocation units from the
donor cells to the recipient cells (shown in Tables 7.21 and 7.22), the total cost
changes by


$$\Delta Z = 10(15 - 17 + 13 - 13) = 10(-2) = 10(c_{25} - u_2 - v_5).$$


Thus the effect of increasing the entering basic variable $x_{25}$ from zero has been a cost
change at the rate of -2 per unit increase in $x_{25}$. This is precisely what the value of
$(c_{25} - u_2 - v_5) = -2$ in Table 7.20 indicates would happen. In fact, another (but
less efficient) way of deriving $(c_{ij} - u_i - v_j)$ for each nonbasic variable $x_{ij}$ is to
identify the chain reaction caused by increasing this variable from 0 to 1 and then to
calculate the resulting cost change. This intuitive interpretation sometimes is useful
for checking calculations during the optimality test.
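
The arithmetic of this interpretation is easy to verify. In the small Python sketch below (variable names are illustrative assumptions), the rate of change is the sum of the unit costs of the recipient cells minus the sum for the donor cells, and the shift of 10 units lowers Z from 2,570 to 2,550.

```python
# Unit costs of the four cells in the chain reaction for entering variable x_25.
cost = {(2, 5): 15, (1, 5): 17, (1, 3): 13, (2, 3): 13}
recipients = [(2, 5), (1, 3)]
donors = [(1, 5), (2, 3)]

rate = sum(cost[c] for c in recipients) - sum(cost[c] for c in donors)
shift = 10                       # value of the leaving basic variable x_15
delta_z = shift * rate
new_z = 2570 + delta_z

print(rate, delta_z, new_z)
```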

Before completing the solution of the Metro Water District problem, let us now 
summarize the rules for the transportation simplex method. 


Summary of Transportation Simplex Method 


1. Initialization step: Construct an initial basic feasible solution by the proce-
dure outlined earlier in this section. Go to the optimality test.
2. Iterative step:
Part 1. Determine the entering basic variable: Select the nonbasic vari-
able $x_{ij}$ having the largest (in absolute terms) negative value of
$(c_{ij} - u_i - v_j)$.

Part 2. Determine the leaving basic variable: Identify the chain reaction
required to retain feasibility when the entering basic variable is increased.
From among the donor cells, select the basic variable having the smallest
value.

Part 3. Determine the new basic feasible solution: Add the value of the
leaving basic variable to the allocation for each recipient cell. Subtract this
value from the allocation for each donor cell.

3. Optimality test: Derive the $u_i$ and $v_j$ by selecting the row having the largest
number of allocations and setting its $u_i = 0$ and then solving the set of equa-
tions $c_{ij} = u_i + v_j$ for each (i, j) such that $x_{ij}$ is basic. If $(c_{ij} - u_i - v_j) \ge
0$ for every (i, j) such that $x_{ij}$ is nonbasic, then the current solution is optimal,
so stop. Otherwise, go to the iterative step.
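
The summary above can be assembled into one compact Python sketch. Everything about its form is an assumption made for illustration (function names, zero-based cells, the stand-in `M = 1000`, first-found tie-breaking, and starting from the solution of Table 7.19); it assumes a valid starting basis and includes no safeguard against cycling. Because its tie-breaking is arbitrary, it may pass through different tableaux than a hand calculation, although the final value of Z is the same.

```python
M = 1000  # assumed numeric stand-in for the "huge" cost M
COSTS = [[16, 16, 13, 22, 17],
         [14, 14, 13, 19, 15],
         [19, 19, 20, 23, M],
         [M, 0, M, 0, 0]]
# Initial basic feasible solution of Table 7.19 (zero-based cells).
START = {(0, 2): 40, (0, 4): 10, (1, 0): 30, (1, 2): 30,
         (2, 0): 0, (2, 1): 20, (2, 3): 30, (3, 4): 50}


def solve_u_v(costs, basis):
    m, n = len(costs), len(costs[0])
    u, v = [None] * m, [None] * n
    start = max(range(m), key=lambda i: sum(1 for r, _ in basis if r == i))
    u[start] = 0
    pending = set(basis)
    while pending:
        for i, j in list(pending):
            if u[i] is not None and v[j] is None:
                v[j] = costs[i][j] - u[i]
            elif u[i] is None and v[j] is not None:
                u[i] = costs[i][j] - v[j]
            elif u[i] is None:
                continue
            pending.discard((i, j))
    return u, v


def chain_reaction(basis, entering):
    cells = set(basis) | {entering}
    pruned = True
    while pruned:                       # discard dead-end cells
        pruned = False
        for i, j in list(cells):
            if (sum(1 for r, _ in cells if r == i) == 1
                    or sum(1 for _, c in cells if c == j) == 1):
                cells.remove((i, j))
                pruned = True
    path, current, in_column = [entering], entering, True
    while True:
        k = 1 if in_column else 0       # coordinate shared with the next cell
        nxt = next(c for c in cells if c[k] == current[k] and c != current)
        if nxt == entering:
            return path[0::2], path[1::2]   # recipients, donors
        path.append(nxt)
        current, in_column = nxt, not in_column


def transportation_simplex(costs, start):
    alloc = dict(start)
    m, n = len(costs), len(costs[0])
    while True:
        u, v = solve_u_v(costs, alloc)                       # optimality test
        candidates = [(costs[i][j] - u[i] - v[j], (i, j))
                      for i in range(m) for j in range(n) if (i, j) not in alloc]
        value, entering = min(candidates, key=lambda t: t[0])   # part 1
        if value >= 0:
            return alloc
        recipients, donors = chain_reaction(alloc, entering)    # part 2
        leaving = min(donors, key=lambda cell: alloc[cell])
        theta = alloc[leaving]
        alloc[entering] = 0                                     # part 3
        for cell in recipients:
            alloc[cell] += theta
        for cell in donors:
            alloc[cell] -= theta
        del alloc[leaving]


final = transportation_simplex(COSTS, START)
z = sum(COSTS[i][j] * x for (i, j), x in final.items())
print(final)
print(z)
```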


Continuing to apply this procedure to the Metro Water District problem yields
the complete set of transportation simplex tableaux shown in Table 7.23. Since all
the $(c_{ij} - u_i - v_j)$ are nonnegative in the fourth tableau, the optimality test identifies
the set of allocations in this tableau as being optimal, which concludes the algorithm.

It would be good practice for you to derive the values of the $u_i$ and $v_j$ given in
the second, third, and fourth tableaux. Try doing this by working directly on the
tableaux. Also check out the chain reactions in the second and third tableaux, which
are somewhat more complicated than the one you already have seen in Table 7.21.

You should note three special points that are illustrated by this example. First,
the initial basic feasible solution is degenerate because the basic variable $x_{31} = 0$.
However, this degenerate basic variable causes no complication, because cell (3, 1)
becomes a recipient cell in the second tableau, which increases $x_{31}$ to a value greater
than zero.

Second, another degenerate basic variable ($x_{34}$) arises in the third tableau because
the basic variables for two donor cells in the second tableau, cells (2, 1) and
(3, 4), tie for having the smallest value (30). (This tie is broken arbitrarily by selecting
$x_{21}$ as the leaving basic variable; if $x_{34}$ had been selected instead, then $x_{21}$ would have
become the degenerate basic variable.) This degenerate basic variable does appear to
create a complication subsequently, because cell (3, 4) becomes a donor cell in the
third tableau but has nothing to donate! Fortunately, such an event actually gives no
cause for concern. Since zero is the amount to be added to or subtracted from the
allocations for the recipient and donor cells, these allocations do not change. However,
the degenerate basic variable does become the leaving basic variable, so it is replaced
by the entering basic variable as the circled allocation of zero in the fourth tableau.
This change in the set of basic variables changes the values of the $u_i$ and $v_j$. Therefore,
if any of the $(c_{ij} - u_i - v_j)$ had been negative in the fourth tableau, the algorithm
would have gone on to make real changes in the allocations (whenever all donor cells
have nondegenerate basic variables).

Third, because none of the $(c_{ij} - u_i - v_j)$ turned out to be negative in the
fourth tableau, the equivalent set of allocations in the third tableau is optimal also.
Thus the algorithm executed one more iteration than necessary. This extra iteration is
a flaw that occasionally arises in both the transportation simplex method and the
simplex method because of degeneracy, but it is not sufficiently serious to warrant
any adjustments in these algorithms.


7.3 The Transshipment Problem





One requirement of the transportation problem is advance knowledge of the method
of distribution of units from each source i to each destination j, so that the corre-
sponding cost per unit ($c_{ij}$) can be determined. Sometimes, however, the best method
of distribution is not clear because of the possibility of transshipments, whereby
shipments would go through intermediate transfer points (which might be other sources
or destinations). For example, rather than shipping a special cargo directly from port
1 to port 3, it may be cheaper to include it with regular cargoes from port 1 to port
2 and then from port 2 to port 3.

Such possibilities for transshipments could be investigated in advance to deter-
mine the cheapest route from each source to each destination. However, this might
be a very complicated and time-consuming task if there are many possible intermediate
transfer points. Therefore, it may be much more convenient to let a computer algorithm
solve simultaneously for the amount to ship from each source to each destination and
the route to follow for each shipment so as to minimize the total shipping cost.
This extension of the transportation problem to include the routing decisions is
referred to as the transshipment problem.

Fortunately, there is a simple way to reformulate the transshipment problem to
fit it back into the format of the transportation problem. Thus the transportation
simplex method also can be used to solve the transshipment problem.

To clarify the structure of the transshipment problem and the nature of this
reformulation, we shall now extend the prototype example for the transportation prob-
lem to include transshipments.


Prototype Example 


After further investigation, the P & T COMPANY (see Sec. 7.1) has found that it 
can cut costs by discontinuing its own trucking operation and using common carriers 
instead to truck its canned peas. Since no single trucking company serves the entire 
area containing all the canneries and warehouses, many of the shipments will need to 
be transferred to another truck at least once along the way. These transfers can be 
made at intermediate canneries or warehouses, or at five other locations (Butte, Mon- 
tana; Boise, Idaho; Cheyenne, Wyoming; Denver, Colorado; and Omaha, Nebraska) 
referred to as junctions, as shown in Fig. 7.2. The shipping cost per truckload between 
each of these points is given in Table 7.24, where a dash indicates that a direct 
shipment is not possible. 

For example, a truckload of peas can still be sent from cannery 1 to warehouse 
4 by direct shipment at a cost of $871. However, another possibility, shown below, 
is to ship the truckload from cannery 1 to junction 2, transfer it to a truck going 
to warehouse 2, and then transfer it again to go to warehouse 4, at a cost of only 
($286 + $207 + $341) = $834. 


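Cannery 1 -------------------------- $871 --------------------------> Warehouse 4

Cannery 1 -- $286 --> Junction 2 -- $207 --> Warehouse 2 -- $341 --> Warehouse 4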


This possibility is only one of many indirect ways of shipping a truckload from 
cannery 1 to warehouse 4 that needs to be considered, if indeed this cannery should 
send anything to this warehouse. The overall problem is to determine how the output 
from all the canneries should be shipped to meet the warehouse allocations and mini- 
mize the total shipping cost. 

Now let us see how this transshipment problem can be reformulated as a trans- 
portation problem. The basic idea is to interpret the individual truck trips (as opposed 
to complete journeys for truckloads) as being the shipment from a source to a desti- 
nation, and so label all 12 locations (canneries, junctions, and warehouses) as being 
both potential destinations and potential sources for these shipments. To illustrate this 
interpretation, consider the above example where a truckload of peas is shipped from 
cannery 1 to warehouse 4 by being transshipped through junction 2 and then ware- 
house 2. The first truck trip for this shipment has cannery 1 as its source and junction 
2 as its destination, but then junction 2 becomes the source for the second truck trip 
with warehouse 2 as its destination. Finally, warehouse 2 becomes the source for the 
third trip with this same shipment, where warehouse 4 then is the destination. In a

Figure 7.2 Location of canneries, warehouses, and junctions for the P & T Co.


Table 7.24 Independent Trucking Data for P & T Co.

Shipping Cost per Truckload (in dollars; a dash indicates that a direct shipment is not possible)

                  Cannery             Junction                Warehouse
From   To        1    2    3 |    1    2    3    4    5 |    1    2    3    4 | Output
Cannery    1          146    - |  324  286    -    -    - |  452  505    -  871 |     75
           2   146         - |  373  212  570  609    - |  335  407  688  784 |    125
           3     -    -      |  658    -  405  419  158 |    -  685  359  673 |    100
Junction   1   322  371  656 |       262  398  430    - |  503  234  329    - |
           2   284  210    - |  262       406  421  644 |  305  207  464  558 |
           3     -  569  403 |  398  406        81  272 |  597  253  171  282 |
           4     -  608  418 |  431  422   81       287 |  613  280  236  229 |
           5     -    -  158 |    -  647  274  288      |  831  501  293  482 |
Warehouse  1   453  336    - |  505  307  599  615  831 |       359  706  587 |
           2   505  407  683 |  235  208  254  281  500 |  357       362  341 |
           3     -  687  357 |  329  464  171  236  290 |  705  362       457 |
           4   868  781  670 |    -  558  282  229  480 |  587  340  457      |
Allocation                   |                          |   80   65   70   85 |

similar fashion, any of the 12 locations can become a source, a destination, or both,
for truck trips.

Thus, for the reformulation as a transportation problem, we have 12 sources
and 12 destinations. The $c_{ij}$ unit costs for the resulting cost and requirements table
shown in Table 7.25 are just the shipping costs per truckload already given in Table
7.24. The impossible shipments indicated by dashes in Table 7.24 are assigned a huge
unit cost of M. Because each location is both a source and a destination, the diagonal
elements in the cost and requirements table represent the unit cost of a shipment from
a given location to itself. The costs of these fictional shipments going nowhere are
zero.


Table 7.25 Cost and Requirements Table for the P & T Co. Transshipment Problem Formulated as a
Transportation Problem

                                                 Destination
                        (Canneries)           (Junctions)               (Warehouses)
Source                  1    2    3 |    4    5    6    7    8 |    9   10   11   12 | Supply
(Canneries)     1       0  146    M |  324  286    M    M    M |  452  505    M  871 |    375
                2     146    0    M |  373  212  570  609    M |  335  407  688  784 |    425
                3       M    M    0 |  658    M  405  419  158 |    M  685  359  673 |    400
(Junctions)     4     322  371  656 |    0  262  398  430    M |  503  234  329    M |    300
                5     284  210    M |  262    0  406  421  644 |  305  207  464  558 |    300
                6       M  569  403 |  398  406    0   81  272 |  597  253  171  282 |    300
                7       M  608  418 |  431  422   81    0  287 |  613  280  236  229 |    300
                8       M    M  158 |    M  647  274  288    0 |  831  501  293  482 |    300
(Warehouses)    9     453  336    M |  505  307  599  615  831 |    0  359  706  587 |    300
               10     505  407  683 |  235  208  254  281  500 |  357    0  362  341 |    300
               11       M  687  357 |  329  464  171  236  290 |  705  362    0  457 |    300
               12     868  781  670 |    M  558  282  229  480 |  587  340  457    0 |    300
Demand                300  300  300 |  300  300  300  300  300 |  380  365  370  385 |

To complete the reformulation of this transshipment problem as a transportation
problem, we now need to explain how to obtain the demand and supply quantities in
Table 7.25. The number of truckloads transshipped through a location should be
included in both the demand for that location as a destination and the supply for that
location as a source. Since we do not know this number in advance, we instead add
a safe upper bound on this number to both the original demand and supply for that
location (shown as allocation and output in Table 7.24) and then introduce the same
slack variable into its demand and supply constraints. (This single slack variable
thereby serves the role of both a dummy source and a dummy destination.) Since it
never would pay to return a truckload to be transshipped through the same location
more than once, a safe upper bound on this number for any location is the total
number of truckloads (300), so we shall use 300 as the upper bound. The slack variable
for both constraints for location i would be $x_{ii}$, the (fictional) number of truckloads
shipped from this location to itself. Thus, $(300 - x_{ii})$ is the real number of truckloads
transshipped through location i.

Adding 300 to each of the output and allocation quantities in Table 7.24 (where
blanks are zeros) now gives us the complete cost and requirements table shown in
Table 7.25 for the transportation problem formulation of our transshipment problem.
Therefore, using the transportation simplex method to obtain an optimal solution for
this transportation problem provides an optimal shipping plan (ignoring the $x_{ii}$) for
the P & T Company.
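
The bookkeeping of this reformulation can be sketched in Python. The function `to_transportation` and its argument names are hypothetical, introduced only for this illustration, and `M` is an assumed numeric stand-in for the huge cost. To keep the sketch short, only the four shipping costs used in the cannery 1 to warehouse 4 example are entered, so every other off-diagonal entry defaults to M here; the full data of Table 7.24 would be supplied in the same way. Locations are numbered 0 to 11 (canneries 0 to 2, junctions 3 to 7, warehouses 8 to 11).

```python
M = 10**6  # assumed numeric stand-in for the huge unit cost M


def to_transportation(direct_cost, output, allocation, big_m=M):
    """Build the cost matrix, supplies and demands of the equivalent
    transportation problem for a transshipment problem."""
    n = len(output)
    bound = sum(output)   # safe upper bound on units passing through a location
    costs = [[0 if i == j else direct_cost.get((i, j), big_m) for j in range(n)]
             for i in range(n)]
    supply = [q + bound for q in output]
    demand = [q + bound for q in allocation]
    return costs, supply, demand


OUTPUT = [75, 125, 100] + [0] * 9            # canneries produce, others do not
ALLOCATION = [0] * 8 + [80, 65, 70, 85]      # only warehouses absorb units
# Partial cost data: just the links of the example route in the text.
SOME_COSTS = {(0, 11): 871, (0, 4): 286, (4, 9): 207, (9, 11): 341}

costs, supply, demand = to_transportation(SOME_COSTS, OUTPUT, ALLOCATION)
indirect = costs[0][4] + costs[4][9] + costs[9][11]
print(supply)
print(demand)
print(indirect, costs[0][11])
```

The resulting supplies and demands agree with the last column and last row of Table 7.25, and total supply equals total demand, as a transportation problem requires.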


General Features 


Our prototype example illustrates all the general features of the transshipment problem 
and its relationship to the transportation problem. Thus the transshipment problem can 
be described in general terms as being concerned with how to allocate and route units 
(truckloads of canned peas in the example) from supply centers (canneries) to receiving 
centers (warehouses) via intermediate transshipment points (junctions, other supply 
centers, and other receiving centers). In addition to transshipping units, each supply 
center generates a given net surplus of units to be distributed, and each receiving 
center absorbs a given net deficit, whereas the junctions neither generate nor absorb 
any units. (The problem has feasible solutions only if the total net surplus generated 
at the supply centers equals the total net deficit to be absorbed at the receiving centers.) 

A direct shipment may be impossible ($c_{ij} = M$) for certain pairs of locations.
In addition, certain supply centers and receiving centers may not be able to serve as
transshipment points at all. In the reformulation of the transshipment problem as a
transportation problem, the easiest way to deal with any such center is to delete its
column (for a supply center) or its row (for a receiving center) in the cost and
requirements table, and then add nothing to its original supply or demand quantity.

A positive cost $c_{ij}$ is incurred for each unit sent directly from location i (a supply
center, junction, or receiving center) to another location j. The objective is to
determine the plan for allocating and routing the units that minimizes the total cost.

The resulting mathematical model for the transshipment problem (see Prob. 27) 
has a special structure slightly different from that for the transportation problem. As 
in the latter case, it has been found that some applications that have nothing to do 
with transportation can be fitted to this special structure. However, regardless of the 
physical context of the application, this model always can be reformulated as an 
equivalent transportation problem in the manner illustrated by the prototype example. 
7.4 The Assignment Problem





The assignment problem is the special type of linear programming problem where 
the resources are being allocated to the activities on a one-to-one basis. Thus each 
resource or assignee (e.g., an employee, machine, or time slot) is to be assigned 
uniquely to a particular activity or assignment (e.g., a task, site, or event). There is 
a cost $c_{ij}$ associated with assignee i (i = 1, 2, ..., n) performing assignment j
(j = 1, 2, ..., n), so that the objective is to determine how all the assignments
should be made in order to minimize the total cost. 


Prototype Example 


The JOB SHOP COMPANY has purchased three new machines of different types. 
There are four available locations in the shop where a machine could be installed. 
Some of these locations are more desirable than others for particular machines because 
of their proximity to work centers that would have a heavy work flow to and from 
these machines. Therefore, the objective is to assign the new machines to the available 
locations to minimize the total cost of materials handling. The estimated cost per unit 
time of materials handling involving each of the machines is given in Table 7.26 for 
the respective locations. Location 2 is not considered suitable for machine 2. There 
would be no work flow between the new machines. 

To formulate this problem as an assignment problem, we must introduce a 
dummy machine for the extra location. Also, an extremely large cost M should be 
attached to the assignment of machine 2 to location 2 to prevent this assignment in 
the optimal solution. The resulting assignment problem cost table is shown in Table 
7.27. This cost table contains all of the necessary data for solving the problem. The 
optimal solution is to assign machine 1 to location 4, machine 2 to location 3, and
machine 3 to location 1, for a total cost of 29. The dummy machine is assigned to 
location 2, so this location is available for some future real machine. 

We shall discuss how this solution is obtained after formulating the mathematical 
model for the general assignment problem. 
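Because this example is so small, the stated solution can be checked by simply enumerating all 4! = 24 one-to-one assignments. The sketch below does this by brute force; it is not one of the solution procedures discussed in this chapter, and the numerical value used for M is an arbitrary large stand-in.

```python
from itertools import permutations

M = 10**6  # arbitrary large stand-in for the book's M (forbidden assignment)

# Cost table of Table 7.27.
# Rows: machines 1, 2, 3 and the dummy machine 4(D); columns: locations 1 to 4.
cost = [
    [13, 16, 12, 11],
    [15, M, 13, 20],
    [5, 7, 10, 6],
    [0, 0, 0, 0],
]


def best_assignment(cost):
    """Return (perm, total), where perm[i] is the column assigned to row i."""
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )
    return best, sum(cost[i][best[i]] for i in range(n))


perm, total = best_assignment(cost)
for machine, location in enumerate(perm, start=1):
    print(f"machine {machine} -> location {location + 1}")
print("total cost:", total)
```

Enumeration is practical only for very small n, since the number of assignments grows as n!.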


The Assignment Problem Model and Solution Procedures 


The mathematical model for the assignment problem uses the following decision 
variables: 


$$x_{ij} = \begin{cases} 1, & \text{if assignee } i \text{ performs assignment } j \\ 0, & \text{if not} \end{cases}$$


Table 7.26 Materials-Handling Cost Data for the Job Shop Co.

                        Location
                   1     2     3     4
            1  |  13    16    12    11
Machine     2  |  15     -    13    20
            3  |   5     7    10     6


Table 7.27 Cost Table for the Job Shop Co. Assignment Problem

                      Assignment (Location)
                   1     2     3     4
            1  |  13    16    12    11
Assignee    2  |  15     M    13    20
(Machine)   3  |   5     7    10     6
          4(D) |   0     0     0     0





for i = 1, 2, ..., n and j = 1, 2, ..., n. Thus each $x_{ij}$ is a binary variable (0 or
1). As discussed at length in the chapter on integer programming (Chap. 13), binary
variables are important in operations research for representing yes-or-no decisions. In
this case, the yes-or-no decisions are: Should assignee i perform assignment j?

Letting Z denote total cost, the assignment problem model is


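$$\text{Minimize} \quad Z = \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_{ij},$$

subject to

$$\sum_{j=1}^{n} x_{ij} = 1, \quad \text{for } i = 1, 2, \ldots, n,$$

$$\sum_{i=1}^{n} x_{ij} = 1, \quad \text{for } j = 1, 2, \ldots, n,$$

and

$$x_{ij} \ge 0, \quad \text{for all } i \text{ and } j \qquad (x_{ij} \text{ binary, for all } i \text{ and } j).$$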


The first set of functional constraints specifies that each assignee performs exactly one 
assignment, whereas the second set requires each assignment to be performed by 
exactly one assignee. If we delete the parenthetical restriction that the $x_{ij}$ be binary,
the model clearly is a special type of linear programming problem and so can be 
readily solved. Fortunately, for reasons about to unfold, we can delete this restriction. 
(This deletion is the reason the assignment problem appears in this chapter rather than 
the integer programming chapter.) 

Now compare this model (without the binary restriction) with the transportation 
problem model presented in the second subsection of Sec. 7.1 (including Table 7.6). 
Note how similar their structures are. In fact, the assignment problem is just a special 
type of transportation problem where the sources now are assignees and the
destinations now are assignments, and where


number of sources (m) = number of destinations (n),
every supply $s_i = 1$,
every demand $d_j = 1$.


Now focus on the integer solutions property in the subsection on the transportation
problem model. Because every $s_i$ and $d_j$ are integers (= 1) now, this property
implies that every basic feasible solution (including an optimal one) is integer for an
assignment problem. The functional constraints of the assignment problem model

Table 7.28 Cost and Requirements Table for the Assignment Problem Formulated as a
Transportation Problem, Illustrated by the Job Shop Co. Example

(a) General Case

                     Cost Per Unit Distributed
                           Destination
                   1        2       ...      n     | Supply
            1  |  c_11     c_12     ...     c_1n   |   1
Source      2  |  c_21     c_22     ...     c_2n   |   1
           ... |  ...      ...      ...     ...    |  ...
         m = n |  c_n1     c_n2     ...     c_nn   |   1
Demand         |   1        1       ...      1

(b) Job Shop Co. Example

                     Cost Per Unit Distributed
                       Destination (location)
                   1     2     3     4   | Supply
            1  |  13    16    12    11   |   1
Source      2  |  15     M    13    20   |   1
(machine)   3  |   5     7    10     6   |   1
          4(D) |   0     0     0     0   |   1
Demand         |   1     1     1     1














prevent any variable from being greater than one, and the nonnegativity constraints 
prevent values less than zero. Therefore, by deleting the binary restriction to enable 
solving an assignment problem as a linear programming problem, the resulting basic 
feasible solutions obtained (including the final optimal solution) automatically will 
satisfy the binary restriction anyway. 

For any particular assignment problem, practitioners normally do not bother 
writing out the full mathematical model. It is simpler to formulate the problem by 
filling out a cost table (e.g., Table 7.27), including identifying the assignees and
assignments, since this table contains all the essential data in a far more compact 
form. 

Because the assignment problem is a special type of transportation problem, one
convenient way to solve any particular assignment problem is to apply the
transportation simplex method described in Sec. 7.2. This approach requires converting the
cost table to a cost and requirements table for the equivalent transportation problem,
as shown in Table 7.28a.

For example, Table 7.28b shows the cost and requirements table for the Job
Shop Co. problem that is obtained from the cost table of Table 7.27. When the
transportation simplex method is applied to this transportation problem formulation,
the resulting optimal solution has basic variables $x_{13} = 0$, $x_{14} = 1$, $x_{23} = 1$,
$x_{31} = 1$, $x_{41} = 0$, $x_{42} = 1$, $x_{43} = 0$. (You are asked to verify this solution in
Prob. 32.) The degenerate basic variables ($x_{ij} = 0$) and the assignment for the dummy
machine ($x_{42} = 1$) don't mean anything for the original problem, so the real assignments
are machine 1 to location 4, machine 2 to location 3, and machine 3 to location 1.
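The optimality of this basic feasible solution can be checked with the optimality test of the transportation simplex method from Sec. 7.2: find multipliers $u_i$ and $v_j$ with $c_{ij} = u_i + v_j$ for every basic variable, then confirm that $c_{ij} - u_i - v_j \ge 0$ for every nonbasic variable. The sketch below assumes the basis is $x_{13}, x_{14}, x_{23}, x_{31}, x_{41}, x_{42}, x_{43}$, sets $u_1 = 0$ arbitrarily, and uses a large number in place of M; the helper name `multipliers` is ours.

```python
M = 10**6  # arbitrary large stand-in for the book's M

# Cost table of Table 7.28b (rows: machines 1-3 and dummy 4(D); columns: locations 1-4).
cost = [
    [13, 16, 12, 11],
    [15, M, 13, 20],
    [5, 7, 10, 6],
    [0, 0, 0, 0],
]

# Basic cells (row, column), 0-indexed: x13, x14, x23, x31, x41, x42, x43.
basis = [(0, 2), (0, 3), (1, 2), (2, 0), (3, 0), (3, 1), (3, 2)]


def multipliers(cost, basis):
    """Solve c_ij = u_i + v_j over the basic cells, starting from u_1 = 0."""
    n = len(cost)
    u, v = [None] * n, [None] * n
    u[0] = 0
    progress = True
    while progress:
        progress = False
        for i, j in basis:
            if u[i] is not None and v[j] is None:
                v[j] = cost[i][j] - u[i]
                progress = True
            elif u[i] is None and v[j] is not None:
                u[i] = cost[i][j] - v[j]
                progress = True
    return u, v


u, v = multipliers(cost, basis)
reduced = {
    (i, j): cost[i][j] - u[i] - v[j]
    for i in range(4)
    for j in range(4)
    if (i, j) not in basis
}
is_optimal = all(value >= 0 for value in reduced.values())
print("u =", u)
print("v =", v)
print("passes optimality test:", is_optimal)
```

Every nonbasic cell has a nonnegative value of $c_{ij} - u_i - v_j$, so this basis passes the test.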

It is no coincidence that this optimal solution provided by the transportation
simplex method has so many degenerate basic variables. For any assignment problem,
the transportation problem formulation shown in Table 7.28a has m = n.
Transportation problems in general have (m + n - 1) basic variables (allocations), so every
basic feasible solution for this particular kind of transportation problem has (2n - 1)
basic variables, but exactly n of these $x_{ij} = 1$. Therefore, there always are (n - 1)
degenerate basic variables ($x_{ij} = 0$). As discussed at the end of Sec. 7.2, degenerate
basic variables do not cause any major complication in the execution of the algorithm.
However, they do frequently cause wasted iterations, where nothing changes (same
allocations) except for the labeling of which allocations of zero correspond to
degenerate basic variables rather than nonbasic variables. These wasted iterations are a
major drawback for applying the transportation simplex method in this kind of
situation, where there always are so many degenerate basic variables.

Another drawback of the transportation simplex method here is that it is purely
a general-purpose algorithm for solving all transportation problems. Therefore, it does
nothing to exploit the additional special structure in this special type of transportation
problem (m = n, every $s_i = 1$, and every $d_j = 1$). Although we will not take the
space to describe them,¹ specialized algorithms have been developed to fully
streamline the procedure for solving just assignment problems. These algorithms operate
directly on the cost table, and do not bother with degenerate basic variables. When a
computer code is available for one of these algorithms, it generally should be used in
preference to the transportation simplex method.


Example— Assigning Products to Plants 


The BETTER PRODUCTS COMPANY has decided to initiate the production of four 
new products, using three plants that currently have excess production capacity. The 
products require a comparable production effort per unit, so the available production 
capacity of the plants is measured by the number of units of any product that can be 
produced per day, as given in the last column of Table 7.29. The bottom row gives 
the required production rate per day to meet projected sales. Each plant can produce 
any of these products, except that Plant 2 cannot produce product 3. However, the 
variable costs per unit of each product differ from plant to plant, as shown in the 
main body of Table 7.29. 

Management now needs to make a decision on how to split up the production 
of the products among plants. Two kinds of options are available. 


Option 1: Permit product splitting, where the same product is produced in more 
than one plant. 
Option 2: Prohibit product splitting. 


This second option imposes a constraint that can only increase the cost of an optimal
solution based on Table 7.29. On the other hand, the key advantage of Option 2 is
that it eliminates some hidden costs associated with product splitting that are not
reflected in Table 7.29, including extra setup, distribution, and administration.
Therefore, management wants both options analyzed before making a final decision. For
Option 2, management further specifies that every plant should be assigned at least
one of the products.


Table 7.29 Data for the Better Products Co. Problem

                     Unit Cost for Product       Capacity
                     1     2     3     4         Available
              1  |  41    27    28    24     |      75
Plant         2  |  40    29     -    23     |      75
              3  |  37    30    27    21     |      45
Production rate  |  20    30    30    40


¹ See Chap. 6 of Selected Reference 9 at the end of this chapter for a description of two algorithms for
the assignment problem.

We will formulate and solve the model for each option in turn, where Option 
1 leads to a transportation problem and Option 2 leads to an assignment problem. 


FORMULATION OF OPTION 1: With product splitting permitted, Table 7.29 can be 
converted directly to a cost and requirements table for a transportation problem. The 
plants become the sources and the products become the destinations (or vice versa), 
so the supplies are the available production capacities and the demands are the required 
production rates. Only two changes need to be made in Table 7.29. First, because 
Plant 2 cannot produce product 3, such an allocation is prevented by assigning it a 
huge unit cost of M. Second, the total capacity (75 + 75 + 45 = 195) exceeds the
total required production (20 + 30 + 30 + 40 = 120), so a dummy destination 
with a demand of 75 is needed to balance these two quantities. The resulting cost and 
requirements table is shown in Table 7.30. 

The optimal solution for this transportation problem has basic variables
(allocations) $x_{12} = 30$, $x_{13} = 30$, $x_{15} = 15$, $x_{24} = 15$, $x_{25} = 60$,
$x_{31} = 20$, and $x_{34} = 25$, so


Plant 1 produces all of products 2 and 3,
Plant 2 produces 37.5 percent of product 4,
Plant 3 produces 62.5 percent of product 4 and all of product 1.


The total cost is Z = 3,260.
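As a check on this solution, the total cost and the use of each plant's capacity can be recomputed directly from the allocations and the unit costs of Table 7.29. The sketch below does only this arithmetic (it does not solve the transportation problem); the variable names are ours, and the allocations to the dummy destination are left out because they merely represent unused capacity at zero cost.

```python
# Unit costs from Table 7.29, indexed as unit_cost[plant][product].
unit_cost = {
    1: {1: 41, 2: 27, 3: 28, 4: 24},
    2: {1: 40, 2: 29, 4: 23},  # Plant 2 cannot produce product 3
    3: {1: 37, 2: 30, 3: 27, 4: 21},
}
capacity = {1: 75, 2: 75, 3: 45}
required = {1: 20, 2: 30, 3: 30, 4: 40}

# Allocations (plant, product) -> units per day for the real products.
alloc = {(1, 2): 30, (1, 3): 30, (2, 4): 15, (3, 1): 20, (3, 4): 25}

total_cost = sum(unit_cost[p][q] * units for (p, q), units in alloc.items())
produced = {q: sum(n for (_, qq), n in alloc.items() if qq == q) for q in required}
unused = {
    p: capacity[p] - sum(n for (pp, _), n in alloc.items() if pp == p)
    for p in capacity
}

print("total cost:", total_cost)
print("production meets requirements:", produced == required)
print("unused capacity by plant:", unused)
```

The unused capacities correspond to the allocations to the dummy destination in the transportation problem formulation.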


FORMULATION OF OPTION 2: Without product splitting, each product must be
assigned to just one plant. Therefore, the products can be interpreted as the
assignments for an assignment problem, where the plants are the assignees.

Management has specified that every plant should be assigned at least one of
the products. There are more products (four) than plants (three), so one of the plants
will need to be assigned two products. Plant 3 has only enough excess capacity to
produce one product (see Table 7.29), so either Plant 1 or Plant 2 will take the extra
product.

To make this assignment of an extra product possible within an assignment


Table 7.30 Cost and Requirements Table for the
Transportation Problem Formulation of Option 1
for the Better Products Co. Problem

                    Cost Per Unit Distributed
                      Destination (Product)
                 1     2     3     4    5(D)  | Supply
            1 |  41    27    28    24     0   |   75
Source      2 |  40    29     M    23     0   |   75
(plant)     3 |  37    30    27    21     0   |   45
Demand        |  20    30    30    40    75






Table 7.31 Cost Table for the Assignment
Problem Formulation of Option 2 for the Better
Products Co. Problem

                       Assignment (Product)
                  1      2      3      4     5(D)
           1a |  820    810    840    960     0
           1b |  820    810    840    960     0
Assignee   2a |  800    870     M     920     0
(Plant)    2b |  800    870     M     920     0
           3  |  740    900    810    840     M





problem formulation, Plants 1 and 2 each are split into two assignees, as shown in 
Table 7.31. 

The number of assignees (now five) must equal the number of assignments (now 
four), so a dummy assignment (product) is introduced into Table 7.31 as 5(D). The 
role of this dummy assignment is to provide the fictional second product to either 
Plant 1 or Plant 2, whichever one receives only one real product. There is no cost 
for producing a fictional product so, as usual, the cost entries for the dummy assign- 
ment are zero. The one exception is the entry of M in the last row of Table 7.31. The 
reason for M here is that Plant 3 must receive a real product, so the Big M method 
is needed to prevent the assignment of the fictional product to Plant 3. 

The remaining cost entries in Table 7.31 are not the unit costs shown in Table 
7.29 or 7.30. For an assignment problem, the cost $c_{ij}$ is the total cost associated with
assignee i performing assignment j. For Table 7.31, the total cost (per day) for Plant
i to produce product j is the unit cost of production times the number of units produced
(per day), where these two quantities for the multiplication are given separately in 
Table 7.29. (As in Table 7.30, M again is used to prevent the infeasible assignment 
of product 3 to Plant 2.) 
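The construction of these cost entries can be sketched as follows. Each real entry is the unit cost from Table 7.29 times the required production rate; Plants 1 and 2 each contribute two identical rows; and the dummy column receives 0, except for Plant 3. The helper name `daily_cost_row` and the use of `float("inf")` for M are our own choices for illustration.

```python
M = float("inf")  # stands in for the book's M

# Table 7.29: unit costs by plant (None marks the product Plant 2 cannot make)
# and the required production rate of each product (units per day).
unit_cost = {
    1: [41, 27, 28, 24],
    2: [40, 29, None, 23],
    3: [37, 30, 27, 21],
}
rate = [20, 30, 30, 40]


def daily_cost_row(costs, dummy_cost):
    """Total daily cost of producing each product in full, plus the dummy column."""
    row = [M if c is None else c * r for c, r in zip(costs, rate)]
    return row + [dummy_cost]


# Plants 1 and 2 are split into two assignees each; Plant 3 may not take the dummy.
table = {
    "1a": daily_cost_row(unit_cost[1], 0),
    "1b": daily_cost_row(unit_cost[1], 0),
    "2a": daily_cost_row(unit_cost[2], 0),
    "2b": daily_cost_row(unit_cost[2], 0),
    "3": daily_cost_row(unit_cost[3], M),
}

for assignee, row in table.items():
    print(assignee, row)
```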

The optimal solution for this assignment problem is 


Plant 1 produces products 2 and 3, 
Plant 2 produces product 1, 
Plant 3 produces product 4, 


where the dummy assignment is given to Plant 2. The total cost is Z = 3,290. 
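With only five assignees, this solution also can be confirmed by enumerating all 5! = 120 assignments in Table 7.31. The brute-force sketch below is only a check, not the transportation simplex method or one of the specialized assignment algorithms; the value used for M is an arbitrary large stand-in.

```python
from itertools import permutations

M = 10**6  # arbitrary large stand-in for the book's M

assignees = ["1a", "1b", "2a", "2b", "3"]
# Cost table of Table 7.31; columns are products 1-4 and the dummy product 5(D).
cost = [
    [820, 810, 840, 960, 0],
    [820, 810, 840, 960, 0],
    [800, 870, M, 920, 0],
    [800, 870, M, 920, 0],
    [740, 900, 810, 840, M],
]
DUMMY = 4  # column index of the dummy product

best = min(
    permutations(range(5)),
    key=lambda p: sum(cost[i][p[i]] for i in range(5)),
)
total = sum(cost[i][best[i]] for i in range(5))

# Merge the split assignees back into plants and drop the dummy product.
plan = {"1": [], "2": [], "3": []}
for i, j in enumerate(best):
    if j != DUMMY:
        plan[assignees[i][0]].append(j + 1)
plan = {plant: sorted(products) for plant, products in plan.items()}

print("plan:", plan)
print("total cost:", total)
```

Which of the identical rows (1a or 1b, 2a or 2b) receives a given product is immaterial, so the result is reported by plant.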

As usual, one way to obtain this optimal solution is to convert the cost table of 
Table 7.31 to a cost and requirements table for the equivalent transportation problem 
(see Table 7.28) and then apply the transportation simplex method. Because of the 
identical rows in Table 7.31, this approach can be streamlined by combining the five 
assignees into three sources with supplies 2, 2, and 1, respectively. (See Prob. 31.) 
This streamlining also decreases by two the number of degenerate basic variables in 
every basic feasible solution. 

Now look back and compare this solution to the one obtained for Option 1, 
which included the splitting of product 4 between Plants 2 and 3. The allocations are 
somewhat different for the two solutions, but the total costs are virtually the same 
(Z = 3,260 for Option 1 versus Z = 3,290 for Option 2). Therefore, considering
the hidden costs associated with product splitting, management decided to adopt the 
Option 2 solution. 
7.5 Multidivisional Problems


Another important class of linear programming problems having an exploitable special 
structure consists of multidivisional problems. Their special feature is that they 
involve coordinating the decisions of the separate divisions of a large organization. 
Because the divisions operate with considerable autonomy, the problem is almost 
decomposable into separate problems, where each division is concerned only with 
optimizing its own operation. However, some overall coordination is required in order 
to best divide certain organizational resources among the divisions. 

As a result of this special feature, the table of constraint coefficients for
multidivisional problems has the block angular structure shown in Table 7.32. (Recall
that shaded blocks represent the only portions of the table that have any nonzero $a_{ij}$
coefficients.) Thus each smaller block contains the coefficients of the constraints for
one subproblem, namely, the problem of optimizing the operation of a division
considered by itself. The long block at the top gives the coefficients of the linking
constraints for the master problem, namely, the problem of coordinating the activities 
of the divisions by dividing organizational resources among them so as to obtain an 
overall optimal solution for the entire organization. 

Because of their nature, multidivisional problems frequently are very large, 
containing many hundreds or even thousands of constraints and variables. Therefore, 
it may be necessary to exploit the special structure in order to be able to solve such 
a problem with a reasonable expenditure of computer time, or even to solve it at all! 
The decomposition principle (described in Selected References 2 and 3 at the end 
of the chapter) provides an effective way of exploiting the special structure. 

Conceptually, this streamlined version of the simplex method can be thought of
as having each division solve its subproblem and sending this solution as its proposal
to "headquarters" (the master problem), where negotiators then coordinate the
proposals from all the divisions to find an optimal solution for the overall organization.
If the subproblems are of manageable size and the master problem is not too large
(not more than 50 to 100 constraints), this approach is successful in solving some
extremely large multidivisional problems. It is particularly worthwhile when the total

Table 7.32 Table of Constraint Coefficients for Multidivisional Problems

                                     Coefficients of Decision Variables for
                                  1st Division   2d Division   ...   Last Division
Constraints on organizational
resources needed by divisions     [ shaded       shaded        ...   shaded ]

Constraints on resources
available only to 1st division    [ shaded ]

Constraints on resources
available only to 2d division                    [ shaded ]

...

Constraints on resources
available only to last division                                      [ shaded ]


number of constraints is quite large (at least several hundred) and there are more than
a few subproblems. 


Prototype Example 


The GOOD FOODS CORPORATION is a very large producer and distributor of food 
products. It has three main divisions: the Processed Foods Division, the Canned Foods 
Division, and the Frozen Foods Division. Because costs and market prices change 
frequently in the food industry, Good Foods periodically uses a corporate linear
programming model to revise the production rates for its various products in order to use
its available production capacities in the most profitable way. This model is similar 
to that for the Wyndor Glass Co. problem (see Sec. 3.1), but on a much larger scale, 
having hundreds of constraints and variables. (Since our space is limited, we shall 
describe a simplified version of this model that combines the products or resources 
by types.) 

The corporation grows its own high-quality corn and potatoes, and these basic 
food materials are the only ones currently in short supply that are used by all the 
divisions. Except for these organizational resources, each division uses only its own 
resources and thus could determine its optimal production rates autonomously. The 
data for each division and the corresponding subproblem involving just its products 
and resources are given in Table 7.33 (where Z represents profit in millions of dollars 
per month), along with the data for the organizational resources. 

The resulting linear programming problem for the corporation is 


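$$\text{Maximize} \quad Z = 8x_1 + 5x_2 + 6x_3 + 9x_4 + 7x_5 + 9x_6 + 6x_7 + 5x_8,$$

subject to

$$\begin{aligned}
5x_1 + 3x_2 + 2x_4 + 3x_6 + 4x_7 + 6x_8 &\le 30 \\
2x_1 + 4x_3 + 3x_4 + 7x_5 + x_7 &\le 20 \\
2x_1 + 4x_2 + 3x_3 &\le 10 \\
7x_1 + 3x_2 + 6x_3 &\le 15 \\
5x_1 + 3x_3 &\le 12 \\
3x_4 + x_5 + 2x_6 &\le 7 \\
2x_4 + 4x_5 + 3x_6 &\le 9 \\
8x_7 + 5x_8 &\le 25 \\
7x_7 + 9x_8 &\le 30 \\
6x_7 + 4x_8 &\le 20
\end{aligned}$$

and

$$x_j \ge 0, \quad \text{for } j = 1, 2, \ldots, 8.$$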


Note how the corresponding table of constraint coefficients shown in Table 7.34 
fits the special structure for multidivisional problems given in Table 7.32. Therefore, 
the Good Foods Corp. can indeed solve this problem (or a more detailed version of 
it) by the streamlined version of the simplex method provided by the decomposition 
principle. 
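The block angular structure can be recognized mechanically: a constraint belongs to a division's subproblem if its nonzero coefficients involve only that division's variables, and it is a linking constraint if it involves variables of more than one division. The sketch below applies this test to the ten functional constraints of the example. The function name `classify` is ours, and the first two rows reflect our reading of the two organizational resource constraints, so their exact coefficient positions should be treated as an assumption.

```python
# Decision variables (0-indexed columns) belonging to each division.
divisions = {
    "Processed": [0, 1, 2],  # x1, x2, x3
    "Canned": [3, 4, 5],     # x4, x5, x6
    "Frozen": [6, 7],        # x7, x8
}

# Coefficients of the ten functional constraints, in the order listed in the model.
A = [
    [5, 3, 0, 2, 0, 3, 4, 6],  # organizational resource (assumed positions)
    [2, 0, 4, 3, 7, 0, 1, 0],  # organizational resource (assumed positions)
    [2, 4, 3, 0, 0, 0, 0, 0],
    [7, 3, 6, 0, 0, 0, 0, 0],
    [5, 0, 3, 0, 0, 0, 0, 0],
    [0, 0, 0, 3, 1, 2, 0, 0],
    [0, 0, 0, 2, 4, 3, 0, 0],
    [0, 0, 0, 0, 0, 0, 8, 5],
    [0, 0, 0, 0, 0, 0, 7, 9],
    [0, 0, 0, 0, 0, 0, 6, 4],
]


def classify(row):
    """Return the single division a constraint involves, or 'linking'."""
    touched = [
        name for name, cols in divisions.items()
        if any(row[c] != 0 for c in cols)
    ]
    return touched[0] if len(touched) == 1 else "linking"


labels = [classify(row) for row in A]
for number, label in enumerate(labels, start=1):
    print(f"constraint {number}: {label}")
```

The two linking constraints form the long block at the top of the table, and the remaining constraints fall into one smaller block per division.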
Table 7.33 Data for the Good Foods Corp. Multidivisional Problem

Processed Foods Division

  Divisional Data
                 Resource Usage/Unit
                     of Product           Amount
    Resource       1     2     3         Available
       1           2     4     3            10
       2           7     3     6            15
       3           5     0     3            12
    ΔZ/unit        8     5     6
    Level         x_1   x_2   x_3

  Subproblem
    Maximize    Z_1 = 8x_1 + 5x_2 + 6x_3,
    subject to  2x_1 + 4x_2 + 3x_3 <= 10
                7x_1 + 3x_2 + 6x_3 <= 15
                5x_1        + 3x_3 <= 12
    and         x_1 >= 0, x_2 >= 0, x_3 >= 0.


Canned Foods Division

  Divisional Data
                 Resource Usage/Unit
                     of Product           Amount
    Resource       4     5     6         Available
       4           3     1     2             7
       5           2     4     3             9
    ΔZ/unit        9     7     9
    Level         x_4   x_5   x_6

  Subproblem
    Maximize    Z_2 = 9x_4 + 7x_5 + 9x_6,
    subject to  3x_4 +  x_5 + 2x_6 <= 7
                2x_4 + 4x_5 + 3x_6 <= 9
    and         x_4 >= 0, x_5 >= 0, x_6 >= 0.


Frozen Foods Division

  Divisional Data
                 Resource Usage/Unit
                     of Product           Amount
    Resource       7     8               Available
       6           8     5                  25
       7           7     9                  30
       8           6     4                  20
    ΔZ/unit        6     5
    Level         x_7   x_8

  Subproblem
    Maximize    Z_3 = 6x_7 + 5x_8,
    subject to  8x_7 + 5x_8 <= 25
                7x_7 + 9x_8 <= 30
                6x_7 + 4x_8 <= 20
    and         x_7 >= 0, x_8 >= 0.


Data for Organizational Resources

                   Resource Usage/Unit of Product          Amount
    Resource     1    2    3    4    5    6    7    8     Available
    Corn         5    3    0    2    0    3    4    6        30
    Potatoes     2    0    4    3    7    0    1    0        20


Important Special Cases 


Some even simpler forms of the special structure exhibited in Table 7.32 arise quite 
frequently. Two particularly common forms are shown in Table 7.35. 

The first form occurs when some or all of the variables can be divided into 
groups such that the sum of the variables in each group must not exceed a specified 
upper bound for that group (or perhaps must equal a specified constant). Constraints 


Table 7.34 Table of Constraint Coefficients for the Good Foods Corp.
Multidivisional Problem







of this form,

$$x_{j_1} + x_{j_2} + \cdots + x_{j_k} \le b_i$$

(or $x_{j_1} + x_{j_2} + \cdots + x_{j_k} = b_i$),

usually are called either generalized upper bound constraints (GUB constraints for
short) or group constraints. Although Table 7.35 shows each GUB constraint as
involving consecutive variables, this is not necessary. For example,

$$x_1 + x_5 + x_9 \le 1$$

is a GUB constraint, as is

$$x_8 + x_3 + x_6 = 20.$$


The second form shown in Table 7.35 occurs when some or all of the individual 
variables must not exceed a specified upper bound for that variable. These constraints, 


$$x_j \le b_i,$$

normally are referred to as upper bound constraints. For example, both

$$x_1 \le 1 \quad \text{and} \quad x_2 \le 5$$

are upper bound constraints.
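The two forms just described are easy to recognize from the coefficients of a single constraint. The sketch below is a simplified illustration: the function name `classify_constraint` and the sample constraints are hypothetical, and it looks at one constraint at a time, so it does not check how the groups of different constraints relate to one another.

```python
def classify_constraint(coeffs, sense):
    """Classify one functional constraint by its form.

    coeffs : dict mapping a variable index j to its coefficient
    sense  : "<=", "=", or ">="
    """
    nonzero = {j: a for j, a in coeffs.items() if a != 0}
    if len(nonzero) == 1:
        (a,) = nonzero.values()
        # a * x_j <= b with a > 0 is equivalent to x_j <= b / a.
        return "upper bound" if sense == "<=" and a > 0 else "other"
    if (
        len(nonzero) >= 2
        and all(a == 1 for a in nonzero.values())
        and sense in ("<=", "=")
    ):
        return "GUB"
    return "other"


# Hypothetical sample constraints, given as ({variable index: coefficient}, sense).
samples = [
    ({2: 1, 4: 1, 7: 1}, "<="),  # x2 + x4 + x7 <= b
    ({3: 1, 1: 1}, "="),         # x3 + x1 = b
    ({5: 1}, "<="),              # x5 <= b
    ({6: 2}, "<="),              # 2 x6 <= b, i.e. x6 <= b / 2
    ({1: 3, 2: 2}, "<="),        # 3 x1 + 2 x2 <= b
]
kinds = [classify_constraint(coeffs, sense) for coeffs, sense in samples]
print(kinds)
```

The right-hand side plays no role in the classification, so it is omitted from the sample data.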


Table 7.35 Table of Constraint Coefficients for Important Special Cases 
of the Structure for Multidivisional Problems Given in Table 7.32 


Generalized Upper Bounds Upper Bounds 
Either GUB or upper bound constraints may occur because of the multidivisional
nature of the problem. However, we should emphasize that they often arise in many 
other contexts as well. In fact, you already have seen several examples containing 
such constraints as summarized below. 

Note in Table 7.6 that all supply constraints in the transportation problem 
actually are GUB constraints. (Table 7.6 fits the form in Table 7.35 by placing the 
supply constraints below the demand constraints.) In addition, the demand constraints 
also are GUB constraints, but ones not involving consecutive variables. 

In the Southern Confederation of Kibbutzim regional planning problem (see 
Sec. 3.4), the constraints involving usable land for each kibbutz and total acreage for 
each crop all are GUB constraints. 

The technological limit constraints in the Nori & Leets Co. air pollution problem
(see Sec. 3.4) are upper bound constraints, as are two of the three functional
constraints in the Wyndor Glass Co. product mix problem (see Sec. 3.1).

Because of the prevalence of GUB and upper bound constraints, special
techniques have been developed for streamlining the way in which the simplex method
deals with them. (The technique for upper bound constraints is described in Sec. 9.1,
and the one for GUB constraints¹ is quite similar.) If there are many such constraints,
these techniques can drastically reduce the computation time for a problem.


7.6 Conclusions 


The linear programming model encompasses a wide variety of specific types of
problems. The general simplex method is a powerful algorithm that can solve surprisingly
large versions of any of these problems. However, some of these problem types have 
such simple formulations that they can be solved much more efficiently by streamlined 
versions of the simplex method that exploit their special structure. These streamlined 
versions can cut down tremendously on the computer time required for large problems, 
and they sometimes make it computationally feasible to solve huge problems. This is 
particularly true for transportation and transshipment problems, assignment problems, 
and problems with many upper bound or GUB constraints. For general multidivisional 
problems, the setup times are sufficiently large for the streamlined procedure that it 
should be used selectively only on large problems. 

We shall reexamine the special structure of the transportation, transshipment, 
and assignment problems in Sec. 10.6. There we shall see that these problems are 
special cases of an important class of linear programming problems known as the 
minimum cost flow problem. This problem has the interpretation of minimizing the 
cost for the flow of goods through a network. This network interpretation will add 
further insight into the structure of these three problems introduced in this chapter. 

Much research continues to be devoted to developing streamlined solution
procedures for special types of linear programming problems, including some not
discussed here. At the same time there is widespread interest in applying linear
programming to optimize the operation of complicated large-scale systems, including social
systems. The resulting formulations usually have special structures that can be
exploited. Recognizing and exploiting special structures has become a very important
factor in the successful application of linear programming.


¹ Dantzig, George B., and Richard M. Van Slyke: "Generalized Upper Bounding Techniques for Linear
Programming," Journal of Computer and System Sciences, 1: 213-226, 1967.

We shall turn our attention in the next chapter to some other important consid- 
erations in applying linear programming. 
PROBLEMS


1.* A company has three plants producing a certain product that is to be shipped to four
distribution centers. Plants 1, 2, and 3 produce 12, 17, and 11 shipments per month,
respectively. Each distribution center needs to receive 10 shipments per month. The distance from
each plant to the respective distribution centers is given in miles as follows:


                    Distribution Center
                 1        2        3        4
         1  |    800    1,300     400      700
Plant    2  |  1,100    1,400     600    1,000
         3  |    600    1,200     800      900





The freight cost for each shipment is $100 plus 50 cents/mile. 
How much should be shipped from each plant to each of the distribution centers to 
minimize the total shipping cost? 
(a) Formulate this problem as a transportation problem by constructing the appropriate 
cost and requirements table. 
(b) Use the northwest corner rule to obtain an initial basic feasible solution. 
(c) Starting with the initial basic feasible solution from part (b), use the transportation 
simplex method to obtain an optimal solution. 


2. Tom would like 3 pints of home brew today and an additional 4 pints of home brew 
tomorrow. Dick is willing to sell a maximum of 5 pints total at a price of $3.00/pint today 
and $2.70/pint tomorrow. Harry is willing to sell a maximum of 4 pints total at a price of 
$2.90/pint today and $2.80/pint tomorrow. 
Tom wishes to know what his purchases should be to minimize his cost while satisfying
his thirst requirements. 

(a) Formulate the linear programming model for this problem, and construct the initial 
simplex tableau (see Chaps. 3 and 4). 

(b) Formulate this problem as a transportation problem by constructing the appropriate 
cost and requirements table. 

(c) Starting with the northwest corner rule, use the transportation simplex method to 
solve the problem as formulated in part (b). 


3. A contractor has to haul gravel to three building sites. He can purchase as much as 
18 tons at a gravel pit in the north of the city and 14 tons at one in the south. He needs 10, 5, 
and 10 tons at sites 1, 2, and 3, respectively. The purchase price per ton at each gravel pit and 
the hauling cost per ton are given in the table below. 









Hauling Cost 


Per Ton Price 


Per Ton 





The contractor wishes to determine how much to haul from each pit to each site in order to 
minimize the total cost for purchasing and hauling gravel. 

(a) Formulate a linear programming model for this problem. Using the Big M method, 
construct the initial simplex tableau ready to apply the simplex method (but do not 
actually solve). 

(b) Now formulate this problem as a transportation problem by constructing the
appropriate cost and requirements table. Compare the size of this table (and the
corresponding transportation simplex tableaux) used by the transportation simplex method
with the size of the simplex tableaux from part (a) that would be needed by the
simplex method.

(c) The contractor notices that he can supply sites 1 and 2 completely from the north 
pit and site 3 completely from the south pit. Use the optimality test (but no iterations) 
of the transportation simplex method to check whether the corresponding basic 
feasible solution is optimal. 

(d) Starting with the northwest corner rule, use the transportation simplex method to 
solve the problem as formulated in part (b). 

(e) As usual, let c_ij denote the unit cost associated with source i and destination j as 
given in the cost and requirements table constructed in part (b). For the optimal 
solution obtained in part (d), suppose that the value of c_ij for each basic variable x_ij 
is fixed at the value given in the cost and requirements table, but that the value of 
c_ij for each nonbasic variable x_ij possibly can be altered through bargaining because 
the site manager wants to pick up the business. Use sensitivity analysis to determine 
the allowable range of values for each of the latter c_ij independently such that this 
optimal solution will remain optimal. 


4. A corporation has decided to produce three new products. Five branch plants now 
have excess product capacity. The unit manufacturing cost of the first product would be $31, 
$29, $32, $28, and $29 in Plants 1, 2, 3, 4, and 5, respectively. The unit manufacturing cost 
of the second product would be $45, $41, $46, $42, and $43 in Plants 1, 2, 3, 4, and 5, 
respectively. The unit manufacturing cost of the third product would be $38, $35, and $40 in 
Plants 1, 2, and 3, respectively, whereas Plants 4 and 5 do not have the capability for producing 
this product. Sales forecasts indicate that 300, 500, and 400 units of products 1, 2, and 3, 
respectively, should be produced per day. Plants 1, 2, 3, 4, and 5 have the capacity to produce 
200, 300, 200, 300, and 500 units daily, respectively, regardless of the product or combinations 
of products involved. Assume that any plant having the capability and capacity to produce them 
can produce any combination of the products in any quantity. 
Management wishes to know how to allocate the new products to the plants to minimize 
total manufacturing cost. 
(a) Formulate this problem as a transportation problem by constructing the appropriate 
cost and requirements table. 
(b) Starting with Vogel’s approximation method, use the transportation simplex method 
to solve the problem as formulated in part (a). 
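
Vogel's approximation method, as used in part (b), can be sketched as follows. The table in the code is one plausible formulation and is an assumption rather than a given: plants are taken as sources, products as destinations, a dummy destination absorbs the 300 units of unused capacity, and `M` is a hypothetical large number standing for the big M cost on the combinations that cannot be produced. The function name `vogel` is also hypothetical.

```python
M = 10**6  # stand-in for the big M cost (assumption)

# Rows: Plants 1-5. Columns: products 1-3 and a dummy destination.
costs = [
    [31, 45, 38, 0],
    [29, 41, 35, 0],
    [32, 46, 40, 0],
    [28, 42, M, 0],
    [29, 43, M, 0],
]
supply = [200, 300, 200, 300, 500]
demand = [300, 500, 400, 300]


def vogel(costs, supply, demand):
    """Vogel's approximation method for a balanced problem; returns {(i, j): amount}."""
    supply, demand = list(supply), list(demand)  # work on copies
    rows, cols = set(range(len(supply))), set(range(len(demand)))
    alloc = {}

    def difference(values):
        # Difference between the two smallest costs still under consideration.
        ordered = sorted(values)
        return ordered[1] - ordered[0] if len(ordered) > 1 else ordered[0]

    while rows and cols:
        row_diff = {i: difference([costs[i][j] for j in cols]) for i in rows}
        col_diff = {j: difference([costs[i][j] for i in rows]) for j in cols}
        best_row = max(sorted(rows), key=lambda i: row_diff[i])
        best_col = max(sorted(cols), key=lambda j: col_diff[j])
        if row_diff[best_row] >= col_diff[best_col]:
            i = best_row
            j = min(sorted(cols), key=lambda j: costs[i][j])
        else:
            j = best_col
            i = min(sorted(rows), key=lambda i: costs[i][j])
        amount = min(supply[i], demand[j])
        alloc[(i, j)] = amount
        supply[i] -= amount
        demand[j] -= amount
        # Eliminate exactly one row or column per step.
        if supply[i] == 0 and len(rows) > 1:
            rows.discard(i)
        else:
            cols.discard(j)
    return alloc


allocation = vogel(costs, supply, demand)
```

Ties are broken here by lowest index, which is an arbitrary choice; a different tie-breaking rule may give a different initial basic feasible solution.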


5.* Suppose that England, France, and Spain produce all the wheat, barley, and oats 
in the world. The world demand for wheat requires 125 million acres of land devoted to wheat 
production. Similarly, 60 million acres of land are required for barley and 75 million acres of 
land for oats. The total amount of land available for these purposes in England, France, and 
Spain is 70 million acres, 110 million acres, and 80 million acres, respectively. The number 
of hours of labor needed in England, France, and Spain to produce an acre of wheat is 18 
hours, 13 hours, and 16 hours, respectively. The number of hours of labor needed in England, 
France, and Spain to produce an acre of barley is 15 hours, 12 hours, and 12 hours, respectively. 
The number of hours of labor needed in England, France, and Spain to produce an acre of oats 
is 12 hours, 10 hours, and 16 hours, respectively. The labor cost per hour in producing wheat 
is $3.00, $2.40, and $3.30 in England, France, and Spain, respectively. The labor cost per 
hour in producing barley is $2.70, $3.00, and $2.80 in England, France, and Spain, respec- 
tively. The labor cost per hour in producing oats is $2.30, $2.50, and $2.10 in England, France, 
and Spain, respectively. The problem is to allocate land use in each country so as to meet the 
world food requirement and minimize the total labor cost. 

(a) Formulate this problem as a transportation problem by constructing the appropriate 
cost and requirements table. 

(b) Starting with the northwest corner rule, use the transportation simplex method to 
solve this problem. 


6. A firm producing a single product has three plants and four customers. The three 
plants will produce 6, 8, and 4 units, respectively, during the next time period. The firm has 
made a commitment to sell 4 units to customer 1, 6 units to customer 2, and at least 2 units 
to customer 3. Both customers 3 and 4 also want to buy as many of the remaining units as 
possible. The net profit associated with shipping a unit from plant i for sale to customer j is 
given by the following table: 





               Customer 
            1    2    3    4 
        1 | Ba   Oy   Bie  2 
Plant   2 | 5    2    1    3 
        3 | 6    4    3    5 





Management wishes to know how many units to sell to customers 3 and 4 and how many units 
to ship from each of the plants to each of the customers to maximize profit. 
(a) Formulate this problem as a transportation problem by constructing the appropriate 
cost and requirements table. 
(b) Starting with Vogel’s approximation method, use the transportation simplex method 
to solve the problem as formulated in part (a). 


7. Plans need to be made for the energy systems for a new building. The three possible 
sources of energy are electricity, natural gas, and a solar heating unit. 
Energy needs in the building are for electricity, water heating, and space heating, where 
the daily requirements (all measured in the same units) are 


Electricity 20 units 
Water heating 10 units 
Space heating 30 units. 


The size of the roof limits the solar heater to 30 units, but there is no limit to the electricity 
and natural gas available. Electricity needs can be met only by purchasing electricity (at a cost 
of $50 per unit). Both other energy needs can be met by any source or combination of sources. 
The unit costs are 


                 Electricity   Natural Gas   Solar Heater 
Water heating        $90           $60           $30 
Space heating        $80           $50           $40 


The objective is to minimize the total cost of meeting the energy needs. 

(a) Formulate this problem as a transportation problem by constructing the appropriate 
cost and requirements table. 

(b) Use the northwest corner rule to obtain an initial basic feasible solution for the 
problem as formulated in part (a). 

(c) Starting with the initial basic feasible solution from part (b), use the transportation 
simplex method to obtain an optimal solution. 

(d) Use Vogel’s approximation method to obtain an initial basic feasible solution for the 
problem as formulated in part (a). 

(e) Starting with the initial basic feasible solution from part (d), use the transportation 
simplex method to obtain an optimal solution. Compare the number of iterations 
required by the transportation simplex method here and in part (c). 





8. A company has two plants producing a certain product that is to be shipped to three 
distribution centers. The unit production costs are the same at the two plants, and the shipping 
cost (in hundreds of dollars) per unit of the product is shown for each combination of plant 
and distribution center as follows: 


           Distribution Center 
            1    2    3 
        A | 8    7    4 
Plant   B | 6    8    5 


A total of 60 units is to be produced and shipped per week. Each plant can produce and ship 
any amount up to a maximum of 50 units per week, so there is considerable flexibility on how 
to divide the total production between the two plants so as to reduce shipping costs. 
Management’s objective is to determine how much should be produced at each plant, 
and then what the overall shipping pattern should be in order to minimize total shipping cost. 
(a) Assume that each distribution center must receive exactly 20 units per week. For- 
mulate this problem as a transportation problem by constructing the appropriate cost 
and requirements table. 
(b) Starting with the northwest corner rule, use the transportation simplex method to 
solve the problem as formulated in part (a). 
(c) Now assume that any distribution center may receive any quantity between 10 and 
30 units per week in order to further reduce total shipping cost, provided only that 
the total shipped to all three distribution centers must still equal 60 units per week. 
Formulate this problem as a transportation problem by constructing the appropriate 
cost and requirements table. 


(d) Starting with Vogel’s approximation method, use the transportation simplex method 
to solve the problem as formulated in part (c). 
(e) Now assume that distribution centers 1, 2, and 3 must receive exactly 10, 20, and 
30 units per week, respectively. For administrative convenience, management has 
decided that each distribution center will be supplied totally by a single plant, so 
that one plant will supply one distribution center and the other plant will supply the 
other two distribution centers. The choice of these assignments of plants to distribution 
centers is to be made solely on the basis of minimizing total shipping cost. 
Formulate this problem as an assignment problem. 
(f) Starting with Russell’s approximation method, use the transportation simplex method 
to solve the problem as formulated in part (e). 


9. Consider the prototype example for the transportation problem (the P & T Co. problem) 
presented at the beginning of Sec. 7.1. Verify that the solution given there actually is 
optimal by applying just the optimality test portion of the transportation simplex method to this 
solution. 


10. Consider the transportation problem formulation of Option 1 for the Better Products 
Co. problem presented in Table 7.30. Verify that the optimal solution given in Sec. 7.4 actually 
is optimal by applying just the optimality test portion of the transportation simplex method to 
this solution. 
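
The optimality test referred to in Probs. 9 and 10 solves c_ij = u_i + v_j over the basic variables and then checks that c_ij - u_i - v_j is nonnegative for every nonbasic variable. A minimal sketch follows; the function name `optimality_test` and the 2 x 2 data in the usage note are hypothetical and are not taken from the P & T Co. or Better Products Co. tables.

```python
def optimality_test(costs, basic_cells):
    """Solve c_ij = u_i + v_j on the basic cells; return (u, v, reduced costs of nonbasic cells)."""
    m, n = len(costs), len(costs[0])
    u, v = [None] * m, [None] * n
    u[0] = 0  # one multiplier can be fixed arbitrarily
    remaining = set(basic_cells)
    while remaining:
        progress = False
        for (i, j) in list(remaining):
            if u[i] is not None and v[j] is None:
                v[j] = costs[i][j] - u[i]
            elif v[j] is not None and u[i] is None:
                u[i] = costs[i][j] - v[j]
            elif u[i] is None:
                continue  # neither multiplier known yet; try again later
            remaining.discard((i, j))
            progress = True
        if not progress:
            raise ValueError("the basic cells do not determine every u_i and v_j")
    reduced = {
        (i, j): costs[i][j] - u[i] - v[j]
        for i in range(m)
        for j in range(n)
        if (i, j) not in basic_cells
    }
    return u, v, reduced
```

With the hypothetical costs [[4, 6], [5, 3]] and basic cells (0, 0), (0, 1), (1, 1), the only nonbasic cell is (1, 0); the basic feasible solution passes the test when its reduced cost, and every other reduced cost returned, is nonnegative.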


11. Consider the transportation problem having the following cost and requirements 
table: 


Destination 





Source 







1 


2 30 
3 30 
4D) 20 





Demand 





After several iterations of the transportation simplex method, the following transportation sim- 
plex tableau is obtained: 






















































[Transportation simplex tableau (columns: destinations, Supply, u_i; rows: sources, Demand, v_j); the cell entries are not recoverable.] 
Continue the transportation simplex method for two more iterations. After two iterations, state 
whether the solution is optimal and, if so, why. 


12.* Consider the transportation problem having the following cost and requirements 
table: 






Destination 
1.2. $3 





4 





Source 






4 


6 
3.2 
8 5 








7 
4 
3 
3 


Demand 3 2° 2 


Use each of the following criteria to obtain an initial basic feasible solution. In each case apply 
the transportation simplex method, starting with this initial solution, to obtain an optimal 
solution. Compare the resulting number of iterations for the transportation simplex method. 

(a) Northwest corner rule. 

(b) Vogel’s approximation method. 

(c) Russell’s approximation method. 


13. Consider the transportation problem having the following cost and requirements 
table: 


                  Destination 
               1    2    3    4    Supply 
          1                           1 
Source    2                           1 
          3                           1 
          4                           1 
Demand         1    1    1    1 


(a) Notice that this problem has three special characteristics: (1) number of sources = 
number of destinations, (2) each supply = 1, and (3) each demand = 1. Transportation 
problems with these characteristics are of a special type called the assignment 
problem (as described in Sec. 7.4). Use the integer solutions property to explain 
why this type of transportation problem can be interpreted as assigning sources to 
destinations on a one-to-one basis. 
(b) How many basic variables are there in every basic feasible solution? How many of 
these are degenerate basic variables (= 0)? 
(c) Use the northwest corner rule to obtain an initial basic feasible solution. 
(d) Construct an initial basic feasible solution by applying the general procedure for the 
initialization step of the transportation simplex method. However, rather than using 
one of the three criteria for step 1 presented in Sec. 7.2, use the following criterion 
for selecting the next basic variable. 

Minimum cost criterion: From among the rows and columns still under consideration, 
select the variable x_ij having the smallest unit cost c_ij to be the next basic variable. (Ties 
may be broken arbitrarily.) 

(e) Starting with the initial basic feasible solution from part (c), use the transportation 
simplex method to obtain an optimal solution. 
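
The minimum cost criterion of part (d) can be sketched as follows. The function name `minimum_cost_start` and the 3 x 3 cost figures in the usage note are hypothetical, not the data of this problem; ties are broken by lowest row and then lowest column index, which is one arbitrary choice among many.

```python
def minimum_cost_start(costs, supply, demand):
    """Build an initial basic feasible solution by always taking the cheapest remaining cell."""
    supply, demand = list(supply), list(demand)  # work on copies
    rows, cols = set(range(len(supply))), set(range(len(demand)))
    basic = []
    while rows and cols:
        # Cheapest cell among the rows and columns still under consideration.
        i, j = min(
            ((i, j) for i in sorted(rows) for j in sorted(cols)),
            key=lambda cell: costs[cell[0]][cell[1]],
        )
        amount = min(supply[i], demand[j])
        basic.append((i, j, amount))
        supply[i] -= amount
        demand[j] -= amount
        # Eliminate exactly one row or column per step, so that
        # m + n - 1 basic variables (some possibly zero) are selected.
        if supply[i] == 0 and len(rows) > 1:
            rows.discard(i)
        else:
            cols.discard(j)
    return basic
```

For instance, `minimum_cost_start([[3, 7, 6], [2, 4, 3], [4, 3, 8]], [1, 1, 1], [1, 1, 1])` returns five basic cells for this hypothetical assignment-type table, two of them degenerate.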


14. Reconsider the transportation problem given in Prob. 13. Starting from a certain 
initial basic feasible solution, the transportation simplex method yields the following final 
transportation simplex tableau: 









[Final transportation simplex tableau: the cell entries are not recoverable; each supply and 
each demand equals 1.] 

Z = 13 


Adapt the sensitivity analysis procedure for general linear programming presented in Secs. 6.6 
and 6.7 to independently investigate each of the two changes in the original model indicated 
below by deducing the resulting change or changes in the above final transportation simplex 
tableau. Use this approach to determine if the basic feasible solution in this tableau is still 
optimal. 
(a) Change c,, = 7 to c,, = 5. 
(b) Change c,; = 1 to c,; = 3. 


15. Consider the transportation problem having the following cost and requirements 
table: 
Destination 
1 
2 
Source 3 
4 
Demand 





(a) Use the northwest corner rule to construct an initial basic feasible solution. 
(b) Starting with the initial basic solution from part (a), use the transportation simplex 
method to obtain an optimal solution. 
16. Consider the transportation problem having the following cost and requirements 
table: 

Destination 
2 3 

5 
7 
3 
3 

(a) Use Vogel’s approximation method to select the first basic variable for an initial 
basic feasible solution. 
(b) Use Russell’s approximation method to select the first basic variable for an initial 
basic feasible solution. 
(c) Use the northwest corner rule to construct a complete initial basic feasible solution. 
(d) Starting with the initial basic feasible solution from part (c), use the transportation 
simplex method to obtain an optimal solution. 


17. Consider the transportation problem having the following cost and requirements 
table: 





Destination 
3 4 





Supply 











2 
4 
6 
f; 
0 
4 





Use each of the following criteria to obtain an initial basic feasible solution. Compare the values 
of the objective function for these solutions. 
(a) Northwest corner rule. 
(b) Vogel’s approximation method. 
(c) Russell’s approximation method. 
(d) Use the best of these solutions to initialize the transportation simplex method to 
obtain an optimal solution. 


18. Consider the transportation problem having the following cost and requirements 
table: 
                        Destination 
            1     2     3     4     5     6  | Supply 
        1 | 13    10    2     29    18    0  |   5 
        2 | 14    13    16    2     M     0  |   6 
Source  3 | 3     0     M     11    6     0  |   7 
        4 | 18    9     19    23    1     0  |   4 
        5 | 30    24    34    36    28    0  |   3 
Demand      3     5     4     5     6     2 








Use each of the following criteria to obtain an initial basic feasible solution. Compare the values 
of the objective function for these solutions. 
(a) Northwest corner rule. 
(b) Vogel’s approximation method. 
(c) Russell’s approximation method. 
(d) Use the best of these solutions to initialize the transportation simplex method and 
then obtain the optimal solution. 


19.* Use the transportation simplex method to solve the Northern Airplane Co. pro- 
duction scheduling problem as it is formulated in Table 7.9. 


20. Consider the Northern Airplane Co. production scheduling problem presented in 
Sec. 7.1 (see Table 7.7). Formulate this problem as a general linear programming problem by 
letting the decision variables be x_j = number of jet engines to be produced in month j 
(j = 1, 2, 3, 4). Construct the initial simplex tableau for this formulation, and then contrast 
the size (number of rows and columns) of this tableau and the transportation simplex tableaux 
for the transportation problem formulation of the problem (see Table 7.9). 


21. The BUILD-EM-FAST COMPANY has agreed to supply its best customer with 
three widgets during each of the next 3 weeks, even though producing them will require some 
overtime work. The relevant production data are as follows: 








        Maximum Production,   Maximum Production,   Production Cost Per Unit, 
Week    Regular Time          Overtime              Regular Time 
 1                                                  $300 
 2                                                  $500 
 3                                                  $400 



















The cost per unit produced with overtime for each week is $100 more than for regular time. 
The cost of storage is $50 per unit for each week it is stored. There is already an inventory of 
two widgets on hand currently, but the company does not want to retain any widgets in inventory 
after the 3 weeks. 

Management wants to know how many units should be produced in each week in order 
to maximize profit. 

(a) Formulate this problem as a transportation problem by constructing the appropriate 

cost and requirements table. 
(b) Use the transportation simplex method to solve this problem. 


22. Consider the transportation problem having the following cost and requirements 
table: 





Destination 








1 2 Supply 
1 8 5 
Source 2 6 4 
Demand 3 








(a) Using your choice of a criterion from Sec. 7.2 for obtaining the initial basic feasible 
solution, solve this problem manually by the transportation simplex method. (Keep 
track of your time.) 

(b) Reformulate this problem as a general linear programming problem, and then solve 
it manually by the simplex method. [Keep track of how long part (b) takes you, and 
contrast it with the computation time for part (a).] 
23. Consider the general linear programming formulation of the transportation problem 
(see Table 7.6). Verify the claim in Sec. 7.2 that the set of (m + n) functional constraint 
equations (m supply constraints and n demand constraints) has one redundant equation; i.e., 
any one equation can be reproduced from a linear combination of the other (m + n - 1) 
equations. 
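
A numerical check of this claim for one small case can be sketched as follows. It is not a proof, only an illustration under stated assumptions: the helper name `constraint_rows` and the size m = 3, n = 4 are hypothetical, and only the left-hand-side coefficients are examined (the right-hand sides agree because total supply equals total demand in a transportation problem).

```python
def constraint_rows(m, n):
    """Coefficient rows of the m supply and n demand equations over x_11, ..., x_mn."""
    # Variable x_ij is stored at position k = i * n + j (0-based indices).
    supply_rows = [[1 if k // n == i else 0 for k in range(m * n)] for i in range(m)]
    demand_rows = [[1 if k % n == j else 0 for k in range(m * n)] for j in range(n)]
    return supply_rows, demand_rows


m, n = 3, 4
supply_rows, demand_rows = constraint_rows(m, n)

# Every variable appears in exactly one supply equation and one demand equation,
# so the supply rows and the demand rows both sum to a row of ones.
sum_supply = [sum(column) for column in zip(*supply_rows)]
sum_demand = [sum(column) for column in zip(*demand_rows)]

# Hence the last demand row equals (sum of supply rows) minus (sum of the other demand rows).
rebuilt = [
    total - sum(row[k] for row in demand_rows[:-1])
    for k, total in enumerate(sum_supply)
]
```

The same subtraction works for any other equation in place of the last demand equation, which is the linear combination the problem asks for.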


24.* Suppose that the air freight charge per ton between seven particular locations is 
given by the following table (except where no direct air freight service is available): 



Location 1 2 3 4 5 6 7 
1 — 21 50 62 93 7 — 
2 21 — 17 54 67 — 48 
3 50 17 — 60 98 67 25 
4 62 54 60 — 27 — 38 
5 93 67 98 27 — 47 42 
6 7 — 67 — 47 — 35 
7 — 48 25 38 42 35 — 





A certain corporation must ship a certain perishable commodity from locations 1-3 to 
locations 4-7. A total of 70, 80, and 50 tons of this commodity are to be sent from locations 
1, 2, and 3, respectively. A total of 30, 60, 50, and 60 tons are to be sent to locations 4, 5, 
6, and 7, respectively. Shipments can be sent through intermediate locations at a cost equal to 
the sum of the costs for each of the legs of the journey. The problem is to determine the 
shipping plan that minimizes the total freight cost. 

(a) Describe how this problem fits into the format of the general transshipment problem. 

(b) Reformulate this problem as an equivalent transportation problem by constructing 
the appropriate cost and requirements table. 

(c) Use Vogel’s approximation method to obtain an initial basic feasible solution for the 
problem formulated in part (b). Describe the corresponding shipping pattern. 

(d) Use the transportation simplex method to obtain an optimal solution for the problem 
formulated in part (b). Describe the corresponding optimal shipping pattern. 


25. Consider the airline company problem described in Prob. 2 at the end of Chap. 10. 

(a) Describe how this problem can be fitted into the format of the transshipment prob- 
lem. 

(b) Reformulate this problem as an equivalent transportation problem by constructing 
the appropriate cost and requirements table. 

(c) Use Vogel’s approximation method to obtain an initial basic feasible solution for the 
problem formulated in part (b). 

(d) Use the transportation simplex method to obtain an optimal solution for the problem 
formulated in part (b). 


26. A student about to enter college away from home has decided that she will need an 
automobile during the next 4 years. But since funds are going to be very limited, she wants to 
do this in the cheapest possible way. However, considering both the initial purchase price and 
the operating and maintenance costs, it is not clear whether she should purchase a very old car 
or just a moderately old car. Furthermore, it is not clear whether she should plan to trade in 
her car at least once during the 4 years, before the costs become too high. 

The relevant data each time she purchases a car are: 





                              Operating and Maintenance          Trade-in Value at End 
                  Purchase    Costs for Ownership Year           of Ownership Year 
                  Price       1       2       3       4          1       2       3       4 
Very old car      $1,200    $1,900  $2,200  $2,500  $2,800      $700    $500    $400    $300 
Moderately 
old car           $4,500    $1,000  $1,300  $1,700  $2,300    $2,500  $1,800  $1,300  $1,000 
















If the student trades in a car during the next 4 years, she would do it at the end of a year 
(during the summer) on another car of one of these two kinds. She definitely plans to trade in 
her car at the end of the 4 years on a much newer model. However, she needs to determine 
which plan for purchasing and (perhaps) trading in cars during the 4 years would minimize the 
total net cost for the 4 years. 
(a) Describe how this problem can be fitted into the format of the transshipment prob- 
lem. 


(b) Reformulate this problem as an equivalent transportation problem by constructing 
the appropriate cost and requirements table. 

(c) Use Russell’s approximation method to obtain an initial basic feasible solution for 
the problem as formulated in part (b). 

(d) Use the transportation simplex method to obtain an optimal solution for the problem 
formulated in part (b). 


27. Without using x_ii variables to introduce fictional shipments from a location to itself, 
formulate the linear programming model for the general transshipment problem described at 
the end of Sec. 7.3. Identify the special structure of this model by constructing its table of 
constraint coefficients (similar to Table 7.6) that shows the location and values of the nonzero 
coefficients. 


28. Four cargo ships will be used for shipping goods from one port to four other ports 
(labeled 1, 2, 3, 4). Any ship can be used for making any one of these four trips. However, 
because of differences in the ships and cargoes, the total cost of loading, transporting, and 
unloading the goods for the different ship—port combinations varies considerably, as shown in 
the following table: 


Ship 


APUN 





The objective is to assign the ships to ports on a one-to-one basis in such a way as to minimize 
the total cost for all four shipments. 
(a) Describe how this problem fits into the general format for the assignment problem. 
(b) Reformulate this problem as an equivalent transportation problem by constructing 
the appropriate cost and requirements table. 
(c) Use the northwest corner rule to obtain an initial basic feasible solution for the 
problem as formulated in part (b). 
(d) Starting with the initial basic feasible solution from part (c), use the transportation 
simplex method to obtain an optimal set of assignments for the original problem. 

(e) Are there other optimal solutions in addition to the one obtained in part (d)? If so, 
use the transportation simplex method to identify them. 


29. Reconsider Prob. 4. Suppose that the sales forecasts have been revised downward 
to 240, 400, and 320 units per day of products 1, 2, and 3, respectively. Thus each plant now 
has the capacity to produce all that is required of any one product. Therefore, management has 
decided that each new product should be assigned to only one plant and that no plant should 
be assigned more than one product (so that three plants are each to be assigned one product, 
and two plants are to be assigned none). The objective is to make these assignments so as to 
minimize the total cost of producing these amounts of the three products. 

(a) Formulate this problem as an assignment problem by constructing the appropriate 
cost table. 

(b) Reformulate this assignment problem as an equivalent transportation problem by 
constructing the appropriate cost and requirements table. 

(c) Starting with Vogel’s approximation method, use the transportation simplex method 
to solve the problem as formulated in part (b). 


30.* The coach of a certain swim team needs to assign swimmers to a 200-yard medley 
relay team to send to the Junior Olympics. Since most of his best swimmers are very fast in 
more than one stroke, it is not clear which swimmer should be assigned to each of the four 
strokes. The five fastest swimmers and the best times (in seconds) they have achieved in each 
of the strokes (for 50 yards) are 


Stroke Carl Chris David Tony Ken 





Backstroke 37.7 32.9 33.8 37.0 35.4 
Breaststroke | 43.4 33.1 42.2 34.7 41.8 
Butterfly 33.3 28.5 38.9 30.4 33.6 
Freestyle 29.2 26.4 29.6 28.5 31.1 





The coach wishes to determine how to assign four swimmers to the four different strokes to 
minimize the sum of the corresponding best times. 
(a) Formulate this problem as an assignment problem. 
(b) Reformulate this assignment problem as an equivalent transportation problem by 
constructing the appropriate cost and requirements table. 
(c) Starting with Vogel’s approximation method, use the transportation simplex method 
to solve the problem as formulated in part (b). 
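
Because this assignment problem is so small, an answer obtained by the transportation simplex method can be cross-checked by enumerating every way of filling the four strokes from the five swimmers. The sketch below is a brute-force check only, not the method requested in part (c); the function name `best_relay` is hypothetical, and the times are those in the table above.

```python
from itertools import permutations

strokes = ["Backstroke", "Breaststroke", "Butterfly", "Freestyle"]
swimmers = ["Carl", "Chris", "David", "Tony", "Ken"]
times = {
    "Backstroke":   [37.7, 32.9, 33.8, 37.0, 35.4],
    "Breaststroke": [43.4, 33.1, 42.2, 34.7, 41.8],
    "Butterfly":    [33.3, 28.5, 38.9, 30.4, 33.6],
    "Freestyle":    [29.2, 26.4, 29.6, 28.5, 31.1],
}


def best_relay():
    """Try every ordered choice of four distinct swimmers; return (total time, assignment)."""
    best_total, best_team = None, None
    for team in permutations(range(len(swimmers)), len(strokes)):
        total = sum(times[stroke][k] for stroke, k in zip(strokes, team))
        if best_total is None or total < best_total:
            best_total, best_team = total, team
    return best_total, {stroke: swimmers[k] for stroke, k in zip(strokes, best_team)}


total, assignment = best_relay()
```

Enumeration examines only 5 x 4 x 3 x 2 = 120 teams here, but the count grows factorially with the number of assignees, which is why the problem asks for a structured method instead.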


31. Consider the assignment problem formulation of Option 2 for the Better Products 
Co. problem presented in Table 7.31. 

(a) Reformulate this problem as an equivalent transportation problem with three sources 
and five destinations by constructing the appropriate cost and requirements table. 

(b) Convert the optimal solution given in Sec. 7.4 for this assignment problem into a 
complete basic feasible solution (including degenerate basic variables) for the trans- 
portation problem formulated in part (a). Specifically, apply the ‘‘General Procedure 
for Constructing an Initial Basic Feasible Solution’ given in Sec. 7.2. For each 
iteration of the procedure, rather than using any of the three alternative criteria 
presented for step 1, select the next basic variable to correspond to the next assign- 
ment of a plant to a product given in the optimal solution. When only one row or 
only one column remains under consideration, use step 4 to select the remaining 
basic variables. 

(c) Verify that the optimal solution given in Sec. 7.4 for this assignment problem 
actually is optimal by applying just the optimality test portion of the transportation 
simplex method to the complete basic feasible solution obtained in part (b). 

(d) Use the northwest corner rule to obtain an initial basic feasible solution for the 
problem as formulated in part (a). 

(e) Starting with the initial basic feasible solution from part (d), use the transportation 
simplex method to obtain an optimal solution for the problem as formulated in part 
(a). Compare this optimal basic feasible solution with the one obtained in part (b). 

(f) Now reformulate this assignment problem as an equivalent transportation problem 
with five sources and five destinations by constructing the appropriate cost and 
requirements table. Compare this transportation problem with the one formulated in 
part (a). 

(g) Repeat part (b) for the problem as formulated in part (f). Compare the basic feasible 
solution obtained with the one from part (c). 


32. Starting with Vogel’s approximation method, use the transportation simplex method 
to solve the Job Shop Co. assignment problem as formulated in Table 7.23. (As stated in Sec. 
7.4, the resulting optimal solution has basic variables x,, = 0, x;4 = 1, x23 = 1, x3, = 1, 
Xa = 0, 2X = 1, X43 = 0.) 


33. Reconsider Prob. 3. Now suppose that trucks (and their drivers) need to be hired 
to do the hauling, where each truck can only be used to haul gravel from a single pit to a single 
site. Each truck can haul 5 tons (and costs five times the hauling cost per ton given earlier). 
Only full trucks would be used to supply each site. 

(a) Formulate this problem as an assignment problem by constructing the appropriate 
cost table, including identifying the assignees and assignments. 

(b) Reformulate this assignment problem as an equivalent transportation problem with 
two sources and three destinations by constructing the appropriate cost and require- 
ments table. 

(c) Starting with the northwest corner rule, use the transportation simplex method to 
solve the problem as formulated in part (b). 


34. Consider the transportation problem formulation and solution of the Metro Water 
District problem presented in Secs. 7.1 and 7.2 (see Tables 7.12 and 7.23). Adapt the sensitivity 
analysis procedure presented in Sec. 6.6 to conduct sensitivity analysis on this problem by 
independently investigating each of the following four changes in the original model. For each 
change, revise the final transportation simplex tableau as needed for identifying and evaluating 
the current basic solution. Then test this solution for feasibility and for optimality. (Do not 
reoptimize.) 

(a) Change c34 from 23 to c34 = 20. 

(b) Change c», from 13 to ca, = 16. 

(c) Decrease the supply from source 2 to 50 and decrease the demand at destination 5 
to 50. 

(d) Increase the supply at source 2 to 80 and increase the demand at destination 2 
to 40. 


35. Consider the assignment problem having the following cost table: 


                  Job 
               1    2    3 
        1    | M    8    7 
Person  2    | 7    6    4 
        3(D) | 0    0    0 


(a) Reformulate this problem as an equivalent transportation problem by constructing 
the appropriate cost and requirements table. 
(b) Use Vogel’s approximation method to obtain an initial basic feasible solution for the 
problem as formulated in part (a). 








(c) Starting with the initial basic feasible solution from part (b), use the transportation 
simplex method to obtain an optimal solution for the problem as formulated in 
part (a). 

36. Consider the assignment problem having the following cost table: 

Assignment 
1 2 3 4 
A 
Assignee R 
D 

(a) Reformulate this problem as an equivalent transportation problem by constructing 
the appropriate cost and requirements table. 

(b) Use the northwest corner rule to obtain an initial basic feasible solution for the 
problem as formulated in part (a). 

(c) Starting with the initial basic feasible solution from part (b), use the transportation 
simplex method to obtain an optimal solution for the problem as formulated in 
part (a). 

37. Consider the assignment problem having the following cost table: 

Assignment 
1 2 3 4 
A 
Assignee a 
D 

(a) Reformulate this problem as an equivalent transportation problem by constructing 
the appropriate cost and requirements table. 

(b) Use the northwest corner rule to obtain an initial basic feasible solution for the 
problem as formulated in part (a). 

(c) Starting with the initial basic feasible solution from part (b), use the transportation 
simplex method to obtain an optimal solution for the problem formulated in part (a). 

38. Consider the linear programming model for the general assignment problem given 
in Sec. 7.4. Construct the table of constraint coefficients for this model. Compare this table 
with the one for the general transportation problem (Table 7.6). In what ways does the general 
assignment problem have more special structure than the general transportation problem? 


39.* Describe how the Wyndor Glass Co. problem formulated in Sec. 3.1 can be in- 
terpreted as a multidivisional linear programming problem. Identify the variables and constraints 
for the master problem and each subproblem. 


40. Consider the following linear programming problem.

Maximize    \(Z = 2x_1 + 4x_2 + 3x_3 + 2x_4 - 5x_5 + 3x_6\),

subject to  3x, + 2x, + 3x; = 30
            2x3 — XS 20
            5x, — 2x, + 3x3 + 4x4 + 2x5 + x= 20
            3x, 15
            2x5 + 3x = 40
            5x, — x, = 30
            2x, + 4x, + 2x4 + 3x6 = 60
            —x, + 2x, + x; 2 20

and         \(x_j \ge 0\), for j = 1, 2, ..., 6.


(a) Rewrite this problem in a form that demonstrates that it possesses the special struc- 
ture for multidivisional problems. Identify the variables and constraints for the master 
problem and each subproblem. 

(b) Construct the corresponding table of constraint coefficients having the block angular 
structure shown in Table 7.32. (Include only nonzero coefficients, and draw a box 
around each block of these coefficients to emphasize this structure.) 


41. Consider the following table of constraint coefficients for a linear programming 
problem: 


Coefficient of 

Constraint XX X3 ua 5 Xs Xs 
1 1 1 | 
2 1 
3 4 3 =2 4 l 
4 2 4 
5 ] 1 
6 5 3 1 -2 4 
7 i 
8 2 ji 3 
9 2 4 


(a) Show how this table can be converted into the block angular structure for multidi- 
visional linear programming as shown in Table 7.32 (with three subproblems in this 
case) by reordering the variables and constraints appropriately. 

(b) Identify the upper bound constraints and GUB constraints for this problem. 


42. A corporation has two divisions (the Eastern Division and the Western Division) 
that operate semiautonomously, with each developing and marketing its own products. How- 
ever, to coordinate their product lines and to promote efficiency, the divisions compete at the 
corporate level for investment funds for new product development projects. In particular, each 
division submits its proposals to corporate headquarters in September for new major projects 
to be undertaken the following year, and available funds are then allocated in such a way as 
to maximize the estimated total net discounted profits that will eventually result from the 
projects. 

For the upcoming year, each division is proposing three new major projects. Each project 
can be undertaken at any level, where the estimated net discounted profit would be proportional
to the level. The relevant data on the projects are summarized as follows:

                                   Eastern Division                      Western Division
                                       Project                               Project
                               1          2          3               1          2          3

Level                        \(x_1\)    \(x_2\)    \(x_3\)         \(x_4\)    \(x_5\)    \(x_6\)
Required investment
  (in millions of dollars)   \(16x_1\)  \(7x_2\)   \(13x_3\)       \(8x_4\)   \(20x_5\)  \(10x_6\)
Net profitability            \(7x_1\)   \(3x_2\)   \(5x_3\)        \(4x_4\)   \(7x_5\)   \(5x_6\)
Facility restriction         \(10x_1 + 3x_2 + 7x_3 \le 50\)        \(6x_4 + 13x_5 + 9x_6 \le 45\)
Labor restriction            \(4x_1 + 2x_2 + 5x_3 \le 30\)         \(3x_4 + 8x_5 + 2x_6 \le 25\)


A total of $150,000,000 is budgeted for investment in these projects. 


(a) Formulate this problem as a multidivisional linear programming problem. 
(b) Construct the corresponding table of constraint coefficients having the block angular
structure shown in Table 7.32.





Formulating Linear Programming Models, Including Goal Programming


Chapter 3 introduced the general nature of linear programming problems, and Chaps. 
4, 5, and 6 described how to solve and analyze them. Then Chap. 7 discussed some 
particularly important special types of linear programming problems. However, these 
chapters have presented only a portion of the story. The most successful users of 
linear programming report that one of the most crucial areas of their work is building 
the model. Many of the most noteworthy applications of linear programming involve 
problems whose natural formulation does not even resemble a linear programming 
model. It is only through some relatively sophisticated formulation techniques that 
the problems can be reformulated to fit linear programming and its exceptionally 
powerful solution procedures. To provide you with a more complete perspective about 
the application of linear programming, this chapter focuses on describing and illus- 
trating some of the most useful formulation techniques. 

The first section describes how to deal with variables and linear functions that 
can be either positive or negative but with different unit costs. This description leads 
into the key topic of goal programming (Sec. 8.2), where the single objective that is
characteristic of linear programming is replaced by several goals toward which we
must strive simultaneously. The formulation technique of Sec. 8.1, however, enables 
us to convert such a problem back into the linear programming format. Section 8.3 
deals with a fairly similar problem, where there are several objective functions and 
the one with the smallest value is to be maximized. Another formulation technique is 
introduced to show us how to restore the linear programming format in this case. 

All three of these sections also illustrate an additional, widely used formulation 
technique, namely, the introduction of auxiliary variables. In contrast to decision 
variables, auxiliary variables do not represent the decisions to be made. Instead, 
auxiliary variables simply are extra variables that are helpful for formulating the 
model. This technique arises again in Sec. 8.4, which presents some examples of 
relatively difficult formulations. Section 8.5 then concludes with a case study (school 
rezoning to achieve racial balance) that pulls together some of the key ideas from this
chapter and the preceding ones. 


8.1 Variables or Linear Functions with Positive and Negative Components


Variables with Positive and Negative Components 


As we discussed at the end of Sec. 4.6, it sometimes is necessary to deal with variables
that are allowed to be either positive or negative. When there is no bound on the
negative values allowed, each such variable (say, \(x_j\)) can be replaced throughout the
model by the difference of two new nonnegative variables (say, \(x_j^+\) and \(x_j^-\)), so that

\[
x_j = x_j^+ - x_j^-, \quad \text{where } x_j^+ \ge 0,\ x_j^- \ge 0.
\]

We interpreted \(x_j^+\) as representing the positive component of \(x_j\) and \(x_j^-\) as its negative
component. In particular,

\[
x_j^+ = \begin{cases} +x_j & \text{if } x_j \ge 0 \\ 0 & \text{if } x_j \le 0, \end{cases}
\qquad
x_j^- = \begin{cases} 0 & \text{if } x_j \ge 0 \\ -x_j & \text{if } x_j \le 0, \end{cases}
\]

for all basic feasible solutions, because such solutions necessarily have the property
that either \(x_j^+ = 0\) or \(x_j^- = 0\) (or both). Therefore, when the simplex method is
applied to the model after substituting \((x_j^+ - x_j^-)\) for \(x_j\), it never will have both \(x_j^+\)
and \(x_j^-\) as basic variables at the same time. (We shall continue to use this notation
with plus and minus superscripts throughout the chapter to represent the positive and
negative components of any quantity, regardless of whether the quantity is the value
of a variable or a function.)

The effect of the choice of value for \(x_j\) may be quite different for positive and
negative values. For example, suppose that \(x_j\) represents the inventory level of a
particular product. If \(x_j > 0\) (so \(x_j^+ > 0\) and \(x_j^- = 0\)), the costs incurred include
storage expenses and interest charges on the capital tied up in this inventory. On the
other hand, \(x_j < 0\) (so \(x_j^- > 0\) and \(x_j^+ = 0\)) means that a shortage of \(x_j^-\) has occurred.
The costs in this case result from lost sales, both now (if customers won’t wait) and
in the future (disgruntled customers won’t return). Because of this difference between
the positive and negative cases, the cost of \(x_j\) is not simply proportional to \(x_j\), so the
proportionality assumption of linear programming (discussed in Sec. 3.3) is violated
for this example. The violation of the proportionality assumption is illustrated in Fig.
8.1, where, instead of a single straight line passing through the origin (the propor-
tionality assumption), the unit cost of holding inventory (positive \(x_j\)) per unit time is
$2, whereas the unit cost of shortages (negative \(x_j\)) per unit time is $3 (instead of
-$2).

Figure 8.1 Illustration of inventory cost violating the proportionality assumption of linear programming.
(Vertical axis: cost. Horizontal axis: inventory level, with negative values representing shortages.)

Fortunately, as long as the proportionality assumption holds for the positive and
negative cases considered separately, the objective function can be reformulated in a
linear programming format by using \(x_j^+\) and \(x_j^-\). Let

\(Z_j\) = contribution of \(x_j\) to the objective function Z.

For appropriate constants, \(c_j^+\) and \(c_j^-\),

\[
\text{if } Z_j = \begin{cases} c_j^+ x_j & \text{for } x_j \ge 0 \\ c_j^- (-x_j) & \text{for } x_j \le 0, \end{cases}
\quad \text{then } Z_j = c_j^+ x_j^+ + c_j^- x_j^-.
\]


For example, in Fig. 8.1, \(c_j^+ = 2\) and \(c_j^- = 3\), which yields

\[
Z_j = \begin{cases} 2x_j & \text{for } x_j \ge 0 \\ 3(-x_j) & \text{for } x_j \le 0, \end{cases}
\quad \text{so that } Z_j = 2x_j^+ + 3x_j^-.
\]
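
The cost calculation for Fig. 8.1 can be sketched in a few lines of Python. The helper names `split` and `inventory_cost` are illustrative only and do not come from the text; the default unit costs are the $2 and $3 figures quoted for Fig. 8.1.

```python
def split(x):
    """Return the positive and negative components (x_plus, x_minus) of x."""
    return (max(x, 0), max(-x, 0))


def inventory_cost(x, c_plus=2, c_minus=3):
    """Cost contribution c_plus * x_plus + c_minus * x_minus for inventory level x."""
    x_plus, x_minus = split(x)
    return c_plus * x_plus + c_minus * x_minus


# Holding 4 units costs 2 * 4; a shortage of 4 units costs 3 * 4.
holding = inventory_cost(4)
shortage = inventory_cost(-4)
```

Because at most one of the two components is nonzero, the single linear expression reproduces both branches of the piecewise cost.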


The one restriction on the use of this technique is that \(c_j^+\) and \(c_j^-\) must satisfy
the following relationship:

\[
c_j^+ + c_j^- \ge 0 \quad \text{when minimizing } Z,
\]
\[
c_j^+ + c_j^- \le 0 \quad \text{when maximizing } Z.
\]

[When this relationship does not hold, the \(Z_j = c_j^+ x_j^+ + c_j^- x_j^-\) contribution to Z
would create an unbounded Z in the favorable direction simply by adding the same
large positive number (unbounded in size) to both \(x_j^+\) and \(x_j^-\). Adding the same number
to \(x_j^+\) and \(x_j^-\) does not change the value of \(x_j = x_j^+ - x_j^-\).]
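
The bracketed argument can be written out in two lines. As a sketch, take any feasible pair of components and add the same amount \(t \ge 0\) to both (the symbol \(t\) is introduced here only for illustration):

```latex
\begin{aligned}
(x_j^+ + t) - (x_j^- + t) &= x_j^+ - x_j^- = x_j, \\
c_j^+ (x_j^+ + t) + c_j^- (x_j^- + t) &= c_j^+ x_j^+ + c_j^- x_j^- + (c_j^+ + c_j^-)\, t .
\end{aligned}
```

So \(x_j\) is unchanged while the contribution to Z shifts by \((c_j^+ + c_j^-)\,t\). When minimizing, a negative sum would let Z decrease without bound as \(t\) grows, whereas a nonnegative sum gives no incentive to choose \(t > 0\).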

Because it arises with some frequency in practical applications, an important
special case of this technique is where \(c_j^+ = c_j^-\) (call this common value \(c_j\)), so that
\(Z_j\) is simply proportional to the absolute value of \(x_j\), \(|x_j|\). To satisfy the preceding
restriction on \(c_j^+\) and \(c_j^-\), assume that \(c_j \ge 0\) when minimizing Z or \(c_j \le 0\) when
maximizing Z. Note that

\[
|x_j| = x_j^+ + x_j^-.
\]

Therefore, if \(Z_j = c_j |x_j|\), then \(Z_j = c_j (x_j^+ + x_j^-)\).
To contrast this case with the one considered at the end of Sec. 4.6 where the pro-
portionality assumption is satisfied,

\[
\text{if } Z_j = c_j x_j, \quad \text{then } Z_j = c_j (x_j^+ - x_j^-).
\]


Linear Functions with Positive and Negative Components 


The measure of performance also can behave as illustrated in Fig. 8.1 when the
abscissa value is given by a linear function instead of a single variable. In fact, the
same example of inventory level frequently arises naturally in the model as a linear
function of the decision variables. You will see this occur in the second example of
Sec. 8.4, where the decision variables for each time period j (or t in Sec. 8.4) are the
production level \(P_j\) and work force level \(W_j\). However, it is necessary to incorporate
the inventory level into the model in order to include the inventory costs in the
objective function. To lay the groundwork for doing this incorporation, we introduce
an auxiliary variable \(x_j\) (or \(I_t\) in Sec. 8.4) for each time period j to represent the
inventory level at the end of the period; then we express this variable as a linear
function of the appropriate decision variables. In this case,

\[
x_j = x_{j-1} + P_j - S_j,
\]

where \(S_j\) is the forecasted sales level (a given constant) for time period j.


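How is this linear function, \(x_{j-1} + P_j - S_j\), incorporated into the model? If
we use the notation with + and - as superscripts introduced at the beginning of the
section, \((x_{j-1} + P_j - S_j)^+\) and \((x_{j-1} + P_j - S_j)^-\) represent the positive and negative
components, respectively, of this function. In other words,

\[
(x_{j-1} + P_j - S_j)^+ = \begin{cases} x_{j-1} + P_j - S_j & \text{if } (x_{j-1} + P_j - S_j) \ge 0 \\ 0 & \text{if } (x_{j-1} + P_j - S_j) \le 0, \end{cases}
\]
\[
(x_{j-1} + P_j - S_j)^- = \begin{cases} 0 & \text{if } (x_{j-1} + P_j - S_j) \ge 0 \\ -(x_{j-1} + P_j - S_j) & \text{if } (x_{j-1} + P_j - S_j) \le 0. \end{cases}
\]

Now we introduce the additional auxiliary variables, \(x_j^+\) and \(x_j^-\), defined as

\[
x_j^+ = (x_{j-1} + P_j - S_j)^+,
\]
\[
x_j^- = (x_{j-1} + P_j - S_j)^-.
\]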


If we define \(c_j^+\) and \(c_j^-\) as in the preceding subsection (and with the same restriction
on their values), the contribution of the inventory cost in period j to the objective
function again is

\[
Z_j = c_j^+ x_j^+ + c_j^- x_j^-.
\]

However, the crucial difference from the preceding subsection is that since the
\(x_j = x_j^+ - x_j^-\) variables are not the decision variables included in the original model, the
definitions of \(x_j^+\) and \(x_j^-\) must be incorporated directly into the linear programming
model. It is not enough to simply record the definitions, as we just did, because the
simplex method considers only the objective function and constraints that constitute
the model. Since

\[
x_j = x_{j-1} + P_j - S_j \quad \text{and} \quad x_j = x_j^+ - x_j^- \quad \text{for each } j,
\]

\(x_j^+\) and \(x_j^-\) can be incorporated directly by adding the equality constraints,

\[
x_j^+ - x_j^- = x_{j-1}^+ - x_{j-1}^- + P_j - S_j \quad \text{for each } j,
\]

to the model. (The variables on the right-hand side of these constraints should be
moved to the left-hand side for proper form.) These additional constraints ensure that
\(x_j^+\) and \(x_j^-\) will take on appropriate values, given the values assigned to the decision
variables by the simplex method.
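
To see what values these equality constraints pin down, the following Python sketch rolls the inventory balance forward for a given production plan. The function name, the starting inventory, and the production and sales figures are all hypothetical; the recursion is the inventory balance stated above, with each period's inventory equal to the previous inventory plus production minus sales.

```python
def inventory_components(x0, production, sales):
    """Return one (x_j, x_plus_j, x_minus_j) tuple per period."""
    rows = []
    previous = x0
    for p, s in zip(production, sales):
        x = previous + p - s                     # x_j = x_{j-1} + P_j - S_j
        rows.append((x, max(x, 0), max(-x, 0)))  # split into components
        previous = x
    return rows


# Hypothetical three-period plan, starting with no inventory.
P = [10, 5, 20]
S = [8, 12, 10]
rows = inventory_components(0, P, S)

# Check the equality constraint x+_j - x-_j = x+_{j-1} - x-_{j-1} + P_j - S_j.
carried = 0
constraints_hold = True
for (x, x_plus, x_minus), p, s in zip(rows, P, S):
    constraints_hold = constraints_hold and (x_plus - x_minus == carried + p - s)
    carried = x_plus - x_minus
```

In the second period the plan runs short, so the negative component carries the shortage while the positive component is zero.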

This technique of introducing auxiliary variables and then using equality con- 
straints to define them in the model is a very common one in a variety of applications. 

You will see the above inventory example of this formulation technique worked 
out in the context of a complete model in Sec. 8.4. 

Perhaps the most important application of this technique is to goal programming, 
which is described next. 


8.2 Goal Programming 


We have assumed throughout the preceding chapters that the objectives of the or- 
ganization conducting the linear programming study can be encompassed within a 
single overriding objective, such as maximizing total profit or minimizing total cost. 
However, this assumption is not always realistic. In fact, as we discussed in Sec. 2.1, 
studies have found that the management of American corporations frequently focuses 
on a variety of other objectives—e.g., to maintain stable profits, increase (or maintain) 
one’s share of the market, diversify products, maintain stable prices, improve worker 
morale, maintain family control of the business, and increase company prestige. Goal 
programming provides a way of striving toward several such objectives simultane- 
ously. 

The basic approach of goal programming is to establish a specific numeric 
goal for each of the objectives, formulate an objective function for each objective, 
and then seek a solution that minimizes the (weighted) sum of deviations of these 
objective functions from their respective goals. 

There are two cases to be considered. One, called nonpreemptive goal pro- 
gramming, is where all of the goals are of roughly comparable importance. The 
other, called preemptive goal programming, is where there is a hierarchy of priority 
levels for the goals, so that the goals of primary importance receive first-priority 
attention, those of secondary importance receive second-priority attention, and so forth 
(if there are more than two priority levels). 

We begin with an example that illustrates the basic features of nonpreemptive
goal programming and then discuss the preemptive case.


Prototype Example for Nonpreemptive Goal Programming


The DEWRIGHT COMPANY is considering three new products to replace current 
models that are being discontinued, so their OR Department has been assigned the 
task of determining which mix of these products should be produced. Management 
wants primary consideration given to three factors: long-run profit, stability in the 
work force, and the level of capital investment that would be required now for new 
equipment. In particular, they have established the goals of (1) achieving a long-run 
profit (net present value) of at least $125,000,000 from these products, (2) maintaining 
the current employment level of 4,000 employees, and (3) holding the capital invest- 
ment to less than $55,000,000. However, management realizes that it probably won’t 
be possible to attain all of these goals simultaneously, so they have discussed their 
priorities with the OR Department. This discussion has led to setting penalty weights 
of 5 for missing the profit goal (per million dollars under), 2 for going over the 
employment goal (per hundred employees), 4 for going under this same goal, and 3
for exceeding the capital investment goal (per million dollars over). Each new prod- 
uct’s contribution to profit, employment level, and capital investment level is pro- 
portional to the rate of production. These contributions per unit rate of production are 
shown in Table 8.1, along with the goals and penalty weights. 


FORMULATION: The Dewright Company problem includes all three possible types
of goals: a lower, one-sided goal (long-run profit); a two-sided goal (employment
level); and an upper, one-sided goal (capital investment). Letting the decision vari-
ables \(x_1\), \(x_2\), \(x_3\) be the production rates of products 1, 2, and 3, respectively, these
goals can be stated as

\[
12x_1 + 9x_2 + 15x_3 \ge 125 \quad \text{(Profit goal)}
\]
\[
5x_1 + 3x_2 + 4x_3 = 40 \quad \text{(Employment goal)}
\]
\[
5x_1 + 7x_2 + 8x_3 \le 55 \quad \text{(Investment goal).}
\]


Note that these three relationships are not constraints. It is not even expected
that all of them can be satisfied simultaneously. The right-hand sides are not fixed
constants with no flexibility. Instead, they are managerial goals to be approached as
closely as possible. More precisely, given the penalty weights in the last column of
Table 8.1, the overall objective becomes

\[
\begin{aligned}
\text{Minimize } Z = {} & 5(12x_1 + 9x_2 + 15x_3 - 125)^- \\
& + 2(5x_1 + 3x_2 + 4x_3 - 40)^+ \\
& + 4(5x_1 + 3x_2 + 4x_3 - 40)^- \\
& + 3(5x_1 + 7x_2 + 8x_3 - 55)^+.
\end{aligned}
\]


Table 8.1 Data for Dewright Co. Nonpreemptive Goal Programming Problem

                      Unit Contribution
                           Product
Factor                 1      2      3     Goal (Units)                       Penalty Weight

Long-run profit       12      9     15     ≥ 125 (millions of dollars)        5
Employment level       5      3      4     = 40 (hundreds of employees)       2(+), 4(-)
Capital investment     5      7      8     ≤ 55 (millions of dollars)         3


This expression uses the notation of + and - as superscripts introduced in Sec. 8.1
to represent the positive and negative components of the function inside the parenthe-
ses. For example, consider the following two cases involving the employment level,
\(5x_1 + 3x_2 + 4x_3\), relative to its goal of 40.

(1) If \((5x_1 + 3x_2 + 4x_3 - 40) = +10\),
    then \((5x_1 + 3x_2 + 4x_3 - 40)^+ = +10\),
         \((5x_1 + 3x_2 + 4x_3 - 40)^- = 0\).

(2) If \((5x_1 + 3x_2 + 4x_3 - 40) = -5\),
    then \((5x_1 + 3x_2 + 4x_3 - 40)^+ = 0\),
         \((5x_1 + 3x_2 + 4x_3 - 40)^- = +5\).


Thus, in the first case, the contribution of the second and third terms to Z is 2(10) + 
4(0) = 20, and in the second case it is 2(0) + 4(5) = 20 again. Because management 
considers overshooting the employment level goal per unit to be half as serious as 
undershooting this goal per unit (penalty weights of 2 versus 4), overshooting by 10 
provides the same total penalty as undershooting by 5. 

Unfortunately, Z is not a linear function, because each of the four terms has the 
nonlinear form illustrated in Fig. 8.1 (with a zero slope on one side of the origin), 
where the value of the abscissa is given by the linear function inside the parentheses. 
Therefore, the simplex method cannot be applied to solve the model in this form. 
However, it can be applied after the model is reformulated to fit the linear program- 
ming format. Reformulating requires using the formulation technique presented in the 
preceding section. 

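In particular, the first step is to introduce the new auxiliary variables,

\[
y_1 = 12x_1 + 9x_2 + 15x_3 - 125,
\]
\[
y_2 = 5x_1 + 3x_2 + 4x_3 - 40,
\]
\[
y_3 = 5x_1 + 7x_2 + 8x_3 - 55,
\]

as well as their positive and negative components,

\[
y_1 = y_1^+ - y_1^-, \quad \text{where } y_1^+ \ge 0,\ y_1^- \ge 0,
\]
\[
y_2 = y_2^+ - y_2^-, \quad \text{where } y_2^+ \ge 0,\ y_2^- \ge 0,
\]
\[
y_3 = y_3^+ - y_3^-, \quad \text{where } y_3^+ \ge 0,\ y_3^- \ge 0.
\]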


Because there is no penalty for exceeding the profit goal of 125 or being under
the investment goal of 55, neither \(y_1^+\) nor \(y_3^-\) should appear in the objective function
representing the total penalty for deviations from the goals. However, it is possible
(and even desirable) to have \(y_1^+ > 0\) and \(y_3^- > 0\), so both of these variables should
appear (along with \(y_1^-\), \(y_2^+\), \(y_2^-\), and \(y_3^+\)) in the equality constraints that define the
relationship between these six auxiliary variables and the three original decision vari-
ables \((x_1, x_2, x_3)\). Using the penalty weights shown in Table 8.1 then leads to the
following linear programming formulation of this goal programming problem:

\[
\text{Minimize } Z = 5y_1^- + 2y_2^+ + 4y_2^- + 3y_3^+,
\]

subject to

\[
\begin{aligned}
12x_1 + 9x_2 + 15x_3 - (y_1^+ - y_1^-) &= 125 \\
5x_1 + 3x_2 + 4x_3 - (y_2^+ - y_2^-) &= 40 \\
5x_1 + 7x_2 + 8x_3 - (y_3^+ - y_3^-) &= 55
\end{aligned}
\]

and

\[
x_j \ge 0, \quad y_k^+ \ge 0, \quad y_k^- \ge 0 \quad (j = 1, 2, 3;\ k = 1, 2, 3).
\]


(If the original problem had any actual linear programming constraints, such as con- 
straints on fixed amounts of certain resources being available, these would be included 
in the model.) 

Applying the simplex method to this formulation yields an optimal solution,
\(x_1 = \frac{25}{3}\), \(x_2 = 0\), \(x_3 = \frac{5}{3}\), with \(y_1^+ = 0\), \(y_1^- = 0\), \(y_2^+ = \frac{25}{3}\), \(y_2^- = 0\), \(y_3^+ = 0\),
\(y_3^- = 0\). Therefore, \(y_1 = 0\), \(y_2 = \frac{25}{3}\), \(y_3 = 0\), so the first and third goals are fully
satisfied, but the employment level goal of 40 is exceeded by \(8\frac{1}{3}\) (833 employees).
The resulting penalty for deviating from the goals is \(Z = 16\frac{2}{3}\).
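
The arithmetic behind this solution can be checked with exact fractions. The sketch below does not run the simplex method; it only evaluates the deviation variables and the total penalty at the production mix (25/3, 0, 5/3), using the goal coefficients and penalty weights of the Dewright Co. example. The dictionary layout and function names are assumptions made for illustration.

```python
from fractions import Fraction as F

# (unit contributions of products 1-3, goal) for each factor.
GOALS = {
    "profit": ((12, 9, 15), 125),
    "employment": ((5, 3, 4), 40),
    "investment": ((5, 7, 8), 55),
}
# Penalty weights as (weight on y_plus, weight on y_minus).
WEIGHTS = {"profit": (0, 5), "employment": (2, 4), "investment": (3, 0)}


def deviations(x):
    """Return {factor: (y_plus, y_minus)} for production rates x."""
    result = {}
    for name, (coeffs, goal) in GOALS.items():
        y = sum(c * v for c, v in zip(coeffs, x)) - goal
        result[name] = (max(y, 0), max(-y, 0))
    return result


def total_penalty(x):
    """Weighted sum of deviations, i.e. the objective Z of the LP formulation."""
    dev = deviations(x)
    return sum(WEIGHTS[n][0] * dev[n][0] + WEIGHTS[n][1] * dev[n][1] for n in GOALS)


x_opt = (F(25, 3), F(0), F(5, 3))
dev_opt = deviations(x_opt)
z_opt = total_penalty(x_opt)
```

Only the employment goal shows a deviation at this point, an overshoot of 25/3 (hundreds of employees), which at a weight of 2 gives the total penalty of 50/3.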


Preemptive Goal Programming 


The preceding example assumes that all of the goals are of roughly comparable im- 
portance. Now consider the case of preemptive goal programming, where there is a 
hierarchy of priority levels for the goals. Such a case arises when one or more of the
goals clearly is far more important than the others. Thus the initial focus should be
on achieving as closely as possible these first-priority goals. The other goals also
might naturally divide further into second-priority goals, third-priority goals, and so
on. After we find an optimal solution with respect to the first-priority goals, we can
break any ties for the optimal solution by considering the second-priority goals. Any
ties that remain after this reoptimization can be broken by considering the third-priority 
goals, and so on. 

When we deal with goals on the same priority level, our approach is just like 
the one described for nonpreemptive goal programming. Any of the same three types 
of goals (lower one-sided, two-sided, upper one-sided) can arise. Different penalty 
weights for deviations from different goals still can be included, if desired. The 
formulation technique of Sec. 8.1 again is used to reformulate this portion of the 
problem to fit the linear programming format. 

One way of solving the overall problem is to solve a sequence of linear pro- 
gramming problems. We shall call this procedure the sequential procedure. 

At the first stage of the sequential procedure, the only goals included in the 
linear programming model are the first-priority goals, and the simplex method is 
applied in the usual way. If the resulting optimal solution is unique, we adopt it 
immediately without considering any additional goals. 

However, if there are multiple optimal solutions with the same optimal value of 
Z (call it Z*), we move to the second stage by adding the second-priority goals to the 
model. If Z* = 0, the auxiliary variables representing the deviations from first-priority 
goals now can be completely deleted from the model, where the equality constraints 
that contain these variables are replaced by the mathematical expressions (inequalities 
or equations) for these goals to ensure that they continue to be fully achieved. On the
other hand, if Z* > 0, the second-stage model simply adds the second-priority goals
to the first-stage model (as if these additional goals actually were first-priority goals), 
but then it also adds the constraint that the first-stage objective function must equal 
Z* (which enables us again to delete the terms involving first-priority goals from the 
second-stage objective function). After we apply the simplex method again, we repeat 
the same process for any lower-priority goals. 

It also is possible to duplicate the work of the sequential procedure with just 
one run of the simplex method if a slight modification is first made in the algorithm. 
We shall call this procedure the streamlined procedure. 

If there are just two priority levels, the modification for the streamlined proce- 
dure is one you already have seen, namely, the form of the Big M method illustrated 
throughout Sec. 4.6. In this form, instead of replacing M throughout the model by 
some huge positive number before running the simplex method, we retain the symbolic 
quantity M in the sequence of simplex tableaux. Each coefficient in row 0 (for each 
iteration) is some linear function, aM + b, where a is the current multiplicative factor 
and b is the current additive factor. The usual decisions based on these coefficients 
(entering basic variable and optimality test) now are based solely on the multiplicative 
factors, except that any ties would be broken by using the additive factors. 

The linear programming formulation for the streamlined procedure with two 
priority levels would include all of the goals in the model in the usual manner, but 
with basic penalty weights of M and 1 assigned to deviations from first-priority and 
second-priority goals, respectively. If different penalty weights are desired within the 
same priority level, these basic penalty weights then are multiplied by the individual 
penalty weights assigned within the level. 

When there are more than two priority levels (say, p of them), the streamlined
procedure generalizes in a straightforward way. The basic penalty weights for the
respective levels now are \(M_1, M_2, \ldots, M_{p-1}, 1\), where \(M_1\) represents a number that
is vastly larger than \(M_2\), \(M_2\) is vastly larger than \(M_3\), . . . , and \(M_{p-1}\) is vastly larger
than 1. Each coefficient in row 0 of each simplex tableau is now a linear function of
all of these quantities, where the multiplicative factor of \(M_1\) is used to make the
necessary decisions, with tiebreakers beginning with the multiplicative factor of \(M_2\)
and ending with the additive factor.
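
This tie-breaking rule amounts to comparing coefficients lexicographically. As a sketch (the tuple representation and the function name are assumptions of this example, not the book's tableau notation), each row 0 coefficient can be stored as a tuple of its multiplicative factors followed by its additive factor, and Python's built-in tuple ordering then performs the comparison. The rule used here is the usual one of choosing the most negative row 0 coefficient as the entering basic variable and stopping when no coefficient is negative.

```python
def entering_variable(row0):
    """Pick the variable whose row 0 coefficient is most negative.

    row0 maps a variable name to a tuple (a_1, ..., a_{p-1}, b) standing for
    a_1*M_1 + ... + a_{p-1}*M_{p-1} + b. Returns None if no coefficient is
    negative, i.e. the optimality test is passed.
    """
    width = len(next(iter(row0.values())))
    zero = (0,) * width
    name, coeff = min(row0.items(), key=lambda item: item[1])
    return name if coeff < zero else None


# Two priority levels: each coefficient is a*M + b, stored as (a, b).
row0 = {"x1": (-5, -12), "x2": (-3, 4), "x3": (-5, -15)}
chosen = entering_variable(row0)  # x1 and x3 tie on a = -5; b breaks the tie

# 1*M - 100 is positive for huge M, so this row passes the optimality test.
finished = entering_variable({"x1": (0, 3), "x2": (1, -100)})
```

Working with tuples avoids substituting a specific large number for each symbolic weight, which is the point of retaining the symbols in the tableau.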

We shall now illustrate both the sequential procedure and the streamlined pro- 
cedure by modifying the Dewright Company problem. 


Example for Preemptive Goal Programming 


Faced with the unpleasant recommendation to increase the company’s work force by 
more than 20 percent, the management of the DEWRIGHT COMPANY has recon- 
sidered the original formulation of the problem that was summarized in Table 8.1. 
This increase in the work force probably would be a rather temporary one, so the very 
high cost of training 833 new employees would be largely wasted, and the large 
(undoubtedly well-publicized) layoffs would make it more difficult for the company 
to attract high-quality employees in the future. Consequently, management has con- 
cluded that a very high priority should be placed on avoiding an increase in the work 
force. Furthermore, management has learned that raising more than $55,000,000 for 
capital investment for the new products would be extremely difficult, so a very high 
priority also should be placed on avoiding capital investment above this level.

Table 8.2 Revised Formulation for Dewright Co. Preemptive
Goal Programming Problem

Priority Level      Factor                Goal       Penalty Weight

First priority      Employment level      ≤ 40       2M
                    Capital investment    ≤ 55       3M

Second priority     Long-run profit       ≥ 125      5
                    Employment level      ≥ 40       4


Based on these considerations, management has concluded that a preemptive 
goal programming approach now should be used, where the two goals just discussed 
should be the first-priority goals, and the other two original goals (exceeding 
$125,000,000 in long-run profit and avoiding a decrease in the employment level) 
should be the second-priority goals. Within the two priority levels, the relative penalty 
weights still should be the same as given in the last column of Table 8.1. This 
reformulation is summarized in Table 8.2. (The portions of Table 8.1 that are not 
included in Table 8.2 are unchanged.) 


SEQUENTIAL PROCEDURE: At the first stage of the sequential procedure, only the
two first-priority goals are included in the linear programming model. Therefore, we
can drop the common factor M for their penalty weights shown in Table 8.2. Pro-
ceeding just as for the nonpreemptive model if these were the only goals, the resulting
linear programming model is

\[
\text{Minimize } Z = 2y_2^+ + 3y_3^+,
\]

subject to

\[
\begin{aligned}
5x_1 + 3x_2 + 4x_3 - (y_2^+ - y_2^-) &= 40 \\
5x_1 + 7x_2 + 8x_3 - (y_3^+ - y_3^-) &= 55
\end{aligned}
\]

and

\[
x_j \ge 0, \quad y_k^+ \ge 0, \quad y_k^- \ge 0 \quad (j = 1, 2, 3;\ k = 2, 3).
\]


(For ease of comparison with the nonpreemptive model with all four goals, we have 
kept the same subscripts on the auxiliary variables.) 

Using the simplex method (or inspection), an optimal solution for this linear
programming model has \(y_2^+ = 0\) and \(y_3^+ = 0\), with Z = 0 (so Z* = 0), because
there are innumerable solutions for \((x_1, x_2, x_3)\) that satisfy the relationships,

\[
5x_1 + 3x_2 + 4x_3 \le 40
\]
\[
5x_1 + 7x_2 + 8x_3 \le 55,
\]

as well as the nonnegativity constraints. Therefore, these two first-priority goals should
be used as constraints hereafter. Using them as constraints will force \(y_2^+\) and \(y_3^+\) to
remain zero and thereby disappear from the model automatically.

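If we drop \(y_2^+\) and \(y_3^+\) but add the second-priority goals, the second-stage linear
programming model becomes

\[
\text{Minimize } Z = 5y_1^- + 4y_2^-,
\]

subject to

\[
\begin{aligned}
12x_1 + 9x_2 + 15x_3 - (y_1^+ - y_1^-) &= 125 \\
5x_1 + 3x_2 + 4x_3 + y_2^- &= 40 \\
5x_1 + 7x_2 + 8x_3 + y_3^- &= 55
\end{aligned}
\]

and

\[
x_j \ge 0, \quad y_1^+ \ge 0, \quad y_k^- \ge 0 \quad (j = 1, 2, 3;\ k = 1, 2, 3).
\]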


Applying the simplex method to this model yields the unique optimal solution, \(x_1 = 5\),
\(x_2 = 0\), \(x_3 = 3\frac{3}{4}\), \(y_1^+ = 0\), \(y_1^- = 8\frac{3}{4}\), \(y_2^- = 0\), \(y_3^- = 0\), with \(Z = 43\frac{3}{4}\).

Because this solution is unique (or because there are no more priority levels),
the procedure can now stop, with \((x_1, x_2, x_3) = (5, 0, 3\frac{3}{4})\) as the optimal solution for
the overall problem. This solution fully achieves both first-priority goals, as well as
one of the second-priority goals (no decrease in employment level), and it falls short
by just \(8\frac{3}{4}\) of the other second-priority goal (long-run profit ≥ 125).
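
One way to read the preemptive ordering is as a lexicographic comparison of penalties, first priority before second. The following sketch (the function name is illustrative) evaluates both priority levels at the production mix (5, 0, 15/4) and, for contrast, at (25/3, 0, 5/3), the mix that minimizes the single weighted penalty of the nonpreemptive model. It checks the arithmetic only and does not perform the optimization.

```python
from fractions import Fraction as F


def priority_penalties(x1, x2, x3):
    """Return (first-priority penalty, second-priority penalty)."""
    profit = 12 * x1 + 9 * x2 + 15 * x3 - 125
    employment = 5 * x1 + 3 * x2 + 4 * x3 - 40
    investment = 5 * x1 + 7 * x2 + 8 * x3 - 55
    # First priority: overshooting employment (2) or investment (3).
    first = 2 * max(employment, 0) + 3 * max(investment, 0)
    # Second priority: undershooting profit (5) or employment (4).
    second = 5 * max(-profit, 0) + 4 * max(-employment, 0)
    return (first, second)


preemptive = priority_penalties(5, 0, F(15, 4))
nonpreemptive = priority_penalties(F(25, 3), 0, F(5, 3))

# Tuples compare lexicographically, mirroring the priority ordering.
preferred = preemptive < nonpreemptive
```

The first mix incurs no first-priority penalty and a second-priority penalty of 175/4, so it is preferred under the preemptive ordering even though its second-priority penalty is larger.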


STREAMLINED PROCEDURE: Using the streamlined procedure instead of the se-
quential procedure, we work with just one linear programming model that includes
all of the goals, as follows:

\[
\text{Minimize } Z = 5y_1^- + 2My_2^+ + 4y_2^- + 3My_3^+,
\]

subject to

\[
\begin{aligned}
12x_1 + 9x_2 + 15x_3 - (y_1^+ - y_1^-) &= 125 \\
5x_1 + 3x_2 + 4x_3 - (y_2^+ - y_2^-) &= 40 \\
5x_1 + 7x_2 + 8x_3 - (y_3^+ - y_3^-) &= 55
\end{aligned}
\]

and

\[
x_j \ge 0, \quad y_k^+ \ge 0, \quad y_k^- \ge 0 \quad (j = 1, 2, 3;\ k = 1, 2, 3).
\]

Because this model uses M to symbolize a huge positive number, the simplex method
should be applied as described and illustrated throughout Sec. 4.6. Doing this naturally
yields the same unique optimal solution just obtained by the sequential procedure.


8.3 Maximizing the Minimum Progress toward All Objectives 


Goal programming is one very useful tool for dealing with problems where several 
objectives must be considered simultaneously. However, it does require establishing 
goals for all of the objectives, and it is not always possible to do this in a meaningful 
way. In particular, some objectives are open-ended and one wants to make as much 
progress toward them as possible. To put it another way, for open-ended objectives 
there is no minimum standard (goal) such that you would be relatively indifferent 
about the amount of progress made beyond this standard. (For example, many man- 
agers consider the objective of maximizing profit to be of this type.) With open-ended 
objectives, you may also want to make progress on all of the objectives simultane- 
ously. In this case, it may be appropriate to maximize the minimum progress toward 
all objectives. 
To formulate this approach, suppose that there are K objectives,

\[
Z_1 = \sum_{j=1}^{n} c_{j1} x_j \quad \text{(Objective 1)}
\]
\[
Z_2 = \sum_{j=1}^{n} c_{j2} x_j \quad \text{(Objective 2)}
\]
\[
\vdots
\]
\[
Z_K = \sum_{j=1}^{n} c_{jK} x_j \quad \text{(Objective K).}
\]

We wish to increase together the values of all of these individual objective functions.
Therefore, the overall objective function for the model becomes

\[
\text{Maximize } Z = \text{minimum } \{Z_1, Z_2, \ldots, Z_K\},
\]

so an optimal solution for \((x_1, x_2, \ldots, x_n)\) is one that makes the smallest \(Z_k\) (k =
1, 2, . . . , K) as large as possible.


This overall objective function certainly does not fit into a linear programming 
format. Now let us see how the problem can be reformulated into this format. 

We begin by introducing an auxiliary variable z to represent the minimum value 
among the K objectives, 

    $z = \text{minimum}\{Z_1, Z_2, \ldots, Z_K\}$.

Introducing this auxiliary variable enables us to write the overall objective function 
as 

    Maximize    $Z = z$,

which is a legitimate linear programming objective function (one variable with a 
coefficient of +1 and all other coefficients zero). 

The remaining question is how to incorporate the definition of z directly into a 
linear programming model. The definition implies that 

    $z \le \sum_{j=1}^{n} c_{1j} x_j$
    ...
    $z \le \sum_{j=1}^{n} c_{Kj} x_j$,

where these inequalities are legitimate linear programming constraints (after bringing 
all variables to the left-hand side for proper form). Furthermore, the definition also 
implies that one or more of these constraints (the one with the smallest right-hand 
side) will hold with equality. Therefore, z is simply the largest quantity that satisfies 
all K of these constraints, which condition is already ensured by maximizing Z = z. 
Consequently, the equivalent linear programming model is 


    Maximize    $Z = z$,

    subject to  $\sum_{j=1}^{n} c_{kj} x_j - z \ge 0$,    for k = 1, 2, ..., K
                $x_j \ge 0$,    for j = 1, 2, ..., n,

    and any other linear programming constraints in the original model. 


If it is clear that z will turn out to be nonnegative, a nonnegativity constraint can be 
included in the model for this variable as well. 


If the $Z_k$ are not measured in common units, they should be multiplied by the 
appropriate constants to convert them to a common unit of measurement. 

When the objectives are to be minimized rather than maximized, the overall 
objective function for the original model would change to 


    Minimize    $Z = \text{maximum}\{Z_1, Z_2, \ldots, Z_K\}$.

The equivalent linear programming model then is 

    Minimize    $Z = z$,

    subject to  $\sum_{j=1}^{n} c_{kj} x_j - z \le 0$,    for k = 1, 2, ..., K
                $x_j \ge 0$,    for j = 1, 2, ..., n,

    and any other linear programming constraints in the original model. 

Prototype Example 


An international relief agency, the FOOD AND AGRICULTURE ORGANIZATION, 
is sending agricultural experts to two underdeveloped countries whose greatest need 
is to increase their food production by improving their agricultural techniques. There- 
fore, the experts will be used to develop pilot projects and training programs to 
demonstrate and teach these techniques. However, the number of such projects that 
can be undertaken is restricted by the limited availability of three required resources: 
equipment, experts, and money. The question is how many projects should be under- 
taken in each of the countries in order to make the best possible use of the resources. 

It has been estimated that each full project undertaken in country 1 eventually 
would increase the food production in this country sufficiently to feed 2,000 additional 
people. The corresponding estimate for country 2 is for an increase that would feed 
an additional 3,000 people. The two countries differ in the mix of resources needed 
for projects. These data are summarized in Table 8.3. It is feasible to consider projects 
at fractional levels as well as whole projects. We assume that fractions of projects 
will affect the data of Table 8.3 proportionally. 

Because both countries are in desperate need, the Food and Agriculture Or- 
ganization is determined to increase the food production in both countries as much as 
possible. Therefore, it has chosen the overall objective of maximizing the minimum 
increase in food production in the two countries. 


Table 8.3 Data for Food and Agriculture Organization Problem 

                Amount Used Per Project
Resource        Country 1     Country 2     Amount Available
Equipment           0             5          20
Experts             1             2          10
Money              60            20         300 (thousands of dollars)
People fed      2,000         3,000

FORMULATION: The decision variables, $x_1$ and $x_2$, are the number of projects to be 
undertaken in countries 1 and 2, respectively. There are two objectives in this case: 
to increase the food production in country 1 and to increase the food production in 
country 2. Their objective functions are 

    $Z_1 = 2,000x_1$    (Objective 1)
    $Z_2 = 3,000x_2$    (Objective 2).

Therefore, using Table 8.3 to construct the constraints, the overall model is 

    Maximize    $Z = \text{minimum}\{Z_1, Z_2\} = \text{minimum}\{2,000x_1, 3,000x_2\}$,

    subject to  $5x_2 \le 20$
                $x_1 + 2x_2 \le 10$
                $60x_1 + 20x_2 \le 300$

    and         $x_1 \ge 0$, $x_2 \ge 0$.


To reformulate this model to fit the linear programming format, we introduce 
an auxiliary variable z, defined as 


    $z = \text{minimum}\{Z_1, Z_2\} = \text{minimum}\{2,000x_1, 3,000x_2\}$,

so that z represents the minimum increase in food production in the two countries. 
For example, if $(x_1, x_2) = (4, 2)$, so that $Z_1 = 8,000$ and $Z_2 = 6,000$, then z = 
6,000 as the minimum of the two quantities. Following the approach described earlier 
in this section to incorporate the definition of z directly into the model, the resulting 
equivalent linear programming model is 

    Maximize    $Z = z$,

    subject to  $2,000x_1 - z \ge 0$
                $3,000x_2 - z \ge 0$
                $5x_2 \le 20$
                $x_1 + 2x_2 \le 10$
                $60x_1 + 20x_2 \le 300$

    and         $x_1 \ge 0$, $x_2 \ge 0$, $z \ge 0$.


Applying the simplex method (which does not differentiate between decision 
variables and auxiliary variables) yields the optimal solution, 


    $x_1 = 45/11$, so $Z_1 = 8,182$
    $x_2 = 30/11$, so $Z_2 = 8,182$
    $z = 8,182$.


Consequently, an additional 8,182 people will be fed in each of the two countries. 
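
As a cross-check on this solution, a model this small can also be solved by brute force. The following Python sketch is not the simplex method: it enumerates the corner points of the reformulated model (every intersection of three constraint boundaries in the variables $x_1$, $x_2$, z) and keeps the feasible one with the largest z. This shortcut is valid here only because the feasible region is bounded. The helper names `solve3` and `best_vertex` are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from itertools import combinations

# Each row is (coefficients of x1, x2, z; right-hand side) in "<=" form.
# The ">=" constraints are multiplied by -1 so that every row reads a.v <= b.
ROWS = [
    ([-2000, 0, 1], 0),   # 2,000 x1 - z >= 0
    ([0, -3000, 1], 0),   # 3,000 x2 - z >= 0
    ([0, 5, 0], 20),      # equipment
    ([1, 2, 0], 10),      # experts
    ([60, 20, 0], 300),   # money (thousands of dollars)
    ([-1, 0, 0], 0),      # x1 >= 0
    ([0, -1, 0], 0),      # x2 >= 0
    ([0, 0, -1], 0),      # z >= 0
]


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 system a.v = b by Cramer's rule; None if singular."""
    d = det3(a)
    if d == 0:
        return None
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(Fraction(det3(m), d))
    return solution


def best_vertex(rows):
    """Return the feasible corner point (x1, x2, z) with the largest z."""
    best = None
    for trio in combinations(rows, 3):
        point = solve3([list(coef) for coef, _ in trio], [rhs for _, rhs in trio])
        if point is None:
            continue
        feasible = all(sum(c * v for c, v in zip(coef, point)) <= rhs
                       for coef, rhs in rows)
        if feasible and (best is None or point[2] > best[2]):
            best = point
    return best


x1, x2, z = best_vertex(ROWS)
print(x1, x2, z)
```

With exact fractions the sketch gives $x_1 = 45/11$, $x_2 = 30/11$, and z = 90,000/11, which rounds to the 8,182 people stated above. The binding constraints are the two definitions of z and the money constraint.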


8.4 Some Formulation Examples 


We now present two examples that illustrate the kinds of challenging formulation 
problems that frequently are encountered in real applications of linear programming. 


Reclaiming Solid Wastes 


The SAVE-IT COMPANY operates a reclamation center that collects four types of 
solid waste materials and then treats them so that they can be amalgamated into a 
salable product. Three different grades of this product can be made, depending upon 
the mix of the materials used. Although there is some flexibility in the mix for each 
grade, quality standards do specify a minimum or maximum percentage (by weight) 
of certain materials allowed in that product grade. These specifications are given in 
Table 8.4 along with the cost of amalgamation and the selling price for each grade. 

The reclamation center collects its solid waste materials from some regular 
sources and so is normally able to maintain a steady production rate for treating these 
materials. Table 8.5 gives the quantities available for collection and treatment each 
week, as well as the cost of treatment, for each type of material. 

The problem facing the company is to determine just how much of each product 
grade to produce and the exact mix of materials to be used for each grade so as to 
maximize the total weekly profit (total sales income minus the total costs of both 
amalgamation and treatment). 


FORMULATION: Before attempting to construct a linear programming model, we 
must give careful consideration to the proper definition of the decision variables. 
Although this definition is often obvious, it sometimes becomes the crux of the entire 
formulation. After clearly identifying what information is really desired and the most 
convenient form for conveying this information by means of decision variables, we 


Table 8.4 Product Data for Save-It Co. 

                                              Amalgamation Cost ($)    Selling Price ($)
Grade    Specification                        Per Pound                Per Pound
A        Not more than 30% of material 1      3.00                     8.50
         Not less than 40% of material 2
         Not more than 50% of material 3
B        Not more than 50% of material 1      2.50                     7.00
         Not less than 10% of material 2
C        Not more than 70% of material 1      2.00                     5.50


Table 8.5 Solid Waste Materials Data for Save-It Co. 

            Pounds/Week     Treatment Cost
Material    Available       ($) Per Pound
1           3,000           3.00
2           2,000           6.00
3           4,000           4.00
4           1,000           5.00

can develop the objective function and the constraints on the values of these decision 
variables. 

In this particular problem, the decisions to be made are well defined, but the 
appropriate means of conveying this information may require some thought. (Try it 
and see if you first obtain the following inappropriate choice of decision variables.) 

Because one set of decisions concerns the amount of each product grade to be 
produced, it would seem natural to define one set of decision variables accordingly. 
Proceeding tentatively along this line, define 


    $y_i$ = number of pounds of product grade i produced per week (i = A, B, C). 

The mixture of each grade is identified by the proportion of each material in the 
product. This identification would suggest defining the other set of decision variables 
as 

    $z_{ij}$ = proportion of material j in product grade i (i = A, B, C; j = 1, 2, 3, 4). 

However, Table 8.5 gives both the treatment cost and the availability of the materials 
by quantity (pounds) rather than proportion, so it is this quantity information that 
needs to be recorded in the objective function and in some of the constraints. For 
material j (j = 1, 2, 3, 4), 

    Quantity of material j used = $z_{Aj} y_A + z_{Bj} y_B + z_{Cj} y_C$.


Unfortunately, this expression is not a linear function because it involves products of 
variables. Therefore, a linear programming model cannot be constructed with these 
decision variables. 

Fortunately, there is another way of defining the decision variables that will fit 
the linear programming format. (Do you see how to do it?) It is accomplished by 
merely replacing each product of the old decision variables by a single variable! In 
other words, define 


    $x_{ij} = z_{ij} y_i$    (for i = A, B, C; j = 1, 2, 3, 4)
           = number of pounds of material j allocated to product grade i per week, 

and then let the $x_{ij}$ be the decision variables. The total amount of product grade i 
produced per week is then $x_{i1} + x_{i2} + x_{i3} + x_{i4}$. The proportion of material j in 
product grade i is $x_{ij}/(x_{i1} + x_{i2} + x_{i3} + x_{i4})$. Therefore, this choice of decision 
variables conveys all the necessary information and proves to be well suited to the 
construction of the following linear programming model. (Note particularly how the 
mixture constraints involving the nonlinear proportion function are written in a linear 
form.) 
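
The linearization mentioned in this note can be sketched in a few lines of Python. A requirement such as "not more than 30% of material 1" in a grade says $x_{i1} \le 0.3(x_{i1} + x_{i2} + x_{i3} + x_{i4})$; multiplying the proportion through by the total and moving everything to the left-hand side leaves a linear row. The function names `mixture_row` and `satisfied` and the sample blend are assumptions made for this illustration only.

```python
from fractions import Fraction


def mixture_row(material, share, n_materials=4):
    """Coefficients of x_i1..x_in in  x_i,material - share*(x_i1 + ... + x_in)."""
    return [(1 if j == material else 0) - share
            for j in range(1, n_materials + 1)]


def satisfied(row, amounts, sense):
    """Check row . amounts <= 0 (sense "<=") or >= 0 (sense ">=")."""
    lhs = sum(c * x for c, x in zip(row, amounts))
    return lhs <= 0 if sense == "<=" else lhs >= 0


# Grade A: "not more than 30% of material 1"  ->  row . x <= 0
row_a1 = mixture_row(1, Fraction(3, 10))
# Grade A: "not less than 40% of material 2"  ->  row . x >= 0
row_a2 = mixture_row(2, Fraction(2, 5))

# Hypothetical blend in pounds of materials 1..4 (30% and 40% exactly).
blend = [300, 400, 200, 100]
print(row_a1, satisfied(row_a1, blend, "<="), satisfied(row_a2, blend, ">="))
```

The first row comes out as 7/10, -3/10, -3/10, -3/10. No division by the total appears anywhere, which is what keeps the constraint linear in the $x_{ij}$.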
The total profit Z is given by 

    $Z = 5.5(x_{A1} + x_{A2} + x_{A3} + x_{A4}) + 4.5(x_{B1} + x_{B2} + x_{B3} + x_{B4})$
        $+ 3.5(x_{C1} + x_{C2} + x_{C3} + x_{C4}) - 3(x_{A1} + x_{B1} + x_{C1})$
        $- 6(x_{A2} + x_{B2} + x_{C2}) - 4(x_{A3} + x_{B3} + x_{C3}) - 5(x_{A4} + x_{B4} + x_{C4})$.

Thus, after combining common terms, the model becomes 

    Maximize    $Z = 2.5x_{A1} - 0.5x_{A2} + 1.5x_{A3} + 0.5x_{A4} + 1.5x_{B1} - 1.5x_{B2} + 0.5x_{B3}$
                    $- 0.5x_{B4} + 0.5x_{C1} - 2.5x_{C2} - 0.5x_{C3} - 1.5x_{C4}$,


subject to the following constraints: 

1. Availability of materials: 

    $x_{A1} + x_{B1} + x_{C1} \le 3,000$
    $x_{A2} + x_{B2} + x_{C2} \le 2,000$
    $x_{A3} + x_{B3} + x_{C3} \le 4,000$
    $x_{A4} + x_{B4} + x_{C4} \le 1,000$.

2. Mixture specifications: 

    $x_{A1} \le 0.3(x_{A1} + x_{A2} + x_{A3} + x_{A4})$
    $x_{A2} \ge 0.4(x_{A1} + x_{A2} + x_{A3} + x_{A4})$
    $x_{A3} \le 0.5(x_{A1} + x_{A2} + x_{A3} + x_{A4})$
    $x_{B1} \le 0.5(x_{B1} + x_{B2} + x_{B3} + x_{B4})$
    $x_{B2} \ge 0.1(x_{B1} + x_{B2} + x_{B3} + x_{B4})$
    $x_{C1} \le 0.7(x_{C1} + x_{C2} + x_{C3} + x_{C4})$.

and 

3. Nonnegativity: 

    $x_{ij} \ge 0$,    for i = A, B, C; j = 1, 2, 3, 4. 


This formulation completes the model, except that the constraints for the mixture 
specifications need to be rewritten in the proper form for a linear programming model 
by bringing all variables to the left-hand side and combining terms, as follows: 


2. Mixture specifications: 

    $0.7x_{A1} - 0.3x_{A2} - 0.3x_{A3} - 0.3x_{A4} \le 0$
    $-0.4x_{A1} + 0.6x_{A2} - 0.4x_{A3} - 0.4x_{A4} \ge 0$
    $-0.5x_{A1} - 0.5x_{A2} + 0.5x_{A3} - 0.5x_{A4} \le 0$
    $0.5x_{B1} - 0.5x_{B2} - 0.5x_{B3} - 0.5x_{B4} \le 0$
    $-0.1x_{B1} + 0.9x_{B2} - 0.1x_{B3} - 0.1x_{B4} \ge 0$
    $0.3x_{C1} - 0.7x_{C2} - 0.7x_{C3} - 0.7x_{C4} \le 0$.


Production and Employment Scheduling¹ 


The BOOMBUST COMPANY faces an unstable sales market and so must frequently 
make adjustments of some kind to compensate for predicted changes in the level of 
sales. When sales are increasing, these adjustments take the form of increasing the 
work force (hiring), having the existing work force work overtime, or using up existing 
(or future) inventories. Similarly, when sales are dropping, the company decreases its 
work force (lays people off), underutilizes its current work force, or builds up inven- 
tories. All these alternatives are costly in some way, especially when they are used 
to extremes. Consequently, the company often uses some combination of these pos- 
sible adjustments. However, it is very difficult to determine just which combination 
is least expensive, particularly when a series of adjustments is being planned to meet 
a series of predicted changes in sales. 

¹ This example is based on a model that was first developed by Fred Hanssmann and Sidney W. Hess in 
"A Linear Programming Approach to Production and Employment Scheduling," Management Technology, 
1:46-52, 1960. 

Therefore, management has asked the OR Department to study this problem and 
develop a systematic procedure for production and employment scheduling that will 
minimize the total cost of meeting the projected sales. The procedure should provide 
a month-by-month schedule over the upcoming 12 months, for planning purposes, 
but then the procedure should be reapplied each month to update the schedule based 
on the latest sales forecasts. 

The Marketing Division provides updated forecasts each month on the total 
volume of projected sales for the company in each of the next 12 months. The deci- 
sions to be made are concerned with the total work-force level (number of employees) 
and the production rate to be scheduled for each of these 12 months. These decisions, 
in turn, determine the net inventory level (amount stored minus back-orders) for these 
months. These quantities are denoted as follows for month t (t = 1, 2, ..., 12): 

    $S_t$ = sales forecast, 
    $W_t$ = work-force level, 
    $P_t$ = production rate, 
    $I_t$ = inventory level at the end of the month. 

Because the company produces more than one product, $S_t$, $P_t$, and $I_t$ each represents 
the total quantity aggregated over all the products, expressed in the common unit of 
dollar value. 

To relate $W_t$ and $P_t$, it is estimated that 10 employees are required on the average 
to produce one unit of production per month without working overtime, so that 

    $W_t = 10P_t$    if the work force is fully utilized on regular time only, 
    $W_t < 10P_t$    if overtime is used, 
    $W_t > 10P_t$    if the work force is underutilized on regular time. 


Various kinds of costs need to be taken into account in the model. Using the 
notation introduced in Sec. 8.1 (+ and — as superscripts), we summarize these costs 
(in units of thousands of dollars) in Table 8.6. 


Table 8.6 Cost Data for Boombust Co. Problem 

Type of Cost       Amount                   Origin of Costs
Hiring cost        $4(W_t - W_{t-1})^+$     Training, reorganization
Layoff cost        $(W_t - W_{t-1})^-$      Severance pay, reorganization, low morale
Regular payroll    $5W_t$                   Wages, fringe benefits
Overtime cost      $7(10P_t - W_t)^+$       Premium wages
Inventory cost     $2I_t^+$                 Storage expenses, interest on capital tied up
Shortage cost      $3I_t^-$                 Customer dissatisfaction, lost future sales





Note that each of the cost functions in Table 8.6 (except the regular payroll) is 
a nonlinear function of the quantity involved because the function has the form illus- 
trated in Fig. 8.1 (but with a zero slope on one side of the origin). Therefore, letting 
Z be the total cost over all 12 months, the "natural" formulation of the model is the 
nonlinear programming problem 

    Minimize    $Z = \sum_{t=1}^{12} \{4(W_t - W_{t-1})^+ + (W_t - W_{t-1})^- + 5W_t + 7(10P_t - W_t)^+ + 2I_t^+ + 3I_t^-\}$,

    subject to  $I_t = I_{t-1} + P_t - S_t$,    for t = 1, 2, ..., 12,

    and         $W_t \ge 0$, $P_t \ge 0$,

where the initial inventory level $I_0$ and work-force level $W_0$ are given. 
Now let us see how the OR Department reformulated this problem to fit the 
linear programming format. 


FORMULATION: As in the preceding example, the key to achieving a linear pro- 
gramming formulation of the problem is the appropriate definition of the decision 
variables. In this case, finding the appropriate definition involves combining two 
formulation techniques that were initially presented in Sec. 8.1. First, we intro- 
duce auxiliary variables ($x_t$ and $y_t$) to represent the quantities $(W_t - W_{t-1})$ and 
$(10P_t - W_t)$, so that 

    $x_t = W_t - W_{t-1}$,    $y_t = 10P_t - W_t$

for t = 1, 2, ..., 12. Next, because each of these variables and the $I_t$ are variables 
with positive and negative components, we replace each of them by the difference of 
two new nonnegative auxiliary variables as per the following summary: 

    $x_t = x_t^+ - x_t^-$,    so $x_t^+ = (W_t - W_{t-1})^+$, $x_t^- = (W_t - W_{t-1})^-$, 
    $y_t = y_t^+ - y_t^-$,    so $y_t^+ = (10P_t - W_t)^+$, 
    $I_t = I_t^+ - I_t^-$, 

where $x_t^+ \ge 0$, $x_t^- \ge 0$, $y_t^+ \ge 0$, $y_t^- \ge 0$, $I_t^+ \ge 0$, $I_t^- \ge 0$. 
The objective function then becomes a linear function, 

    $Z = \sum_{t=1}^{12} \{4x_t^+ + x_t^- + 5W_t + 7y_t^+ + 2I_t^+ + 3I_t^-\}$.

The set of constraints for the linear programming formulation can be constructed 
simply by using the constraints for the preceding nonlinear programming problem 
(after substituting $I_t^+ - I_t^-$ for $I_t$ for t = 1, 2, ..., 12), and then incorporating the 
definition of the other nonnegative auxiliary variables into the model by introducing 
additional equality constraints. Consequently, the complete linear programming 
model¹ is 

    Minimize    $Z = \sum_{t=1}^{12} \{4x_t^+ + x_t^- + 5W_t + 7y_t^+ + 2I_t^+ + 3I_t^-\}$,

    subject to  $I_t^+ - I_t^- = I_{t-1}^+ - I_{t-1}^- + P_t - S_t$
                $x_t^+ - x_t^- = W_t - W_{t-1}$        for t = 1, 2, ..., 12, 
                $y_t^+ - y_t^- = 10P_t - W_t$

    and         $W_t \ge 0$, $P_t \ge 0$, $x_t^+ \ge 0$, $x_t^- \ge 0$, $y_t^+ \ge 0$, $y_t^- \ge 0$, $I_t^+ \ge 0$, $I_t^- \ge 0$
                (t = 1, 2, ..., 12), 

except that the variables appearing in the right-hand sides of the functional constraints 
still need to be transferred to the left-hand side for proper form. (Also, in the first 
constraint, when t = 1, $I_0^+ - I_0^-$ should be replaced by the known constant $I_0$.) 


8.5 A Case Study—School Rezoning to Achieve Racial Balance² 


The CITY OF MIDDLETOWN has three high schools, two of them attended primarily 
by white students and the other attended primarily by black students. Therefore, the 
Middletown school board has decided to redesign the school attendance zones to 
reduce the racial isolation in these schools. The new zones will apply only to students 
entering high school in the future, so the goal is to achieve reasonable racial balance 
in 3 years without substantially increasing the distances that the students must travel 
to school. 

The school district superintendent has read some articles about how operations 
research has been used to greatly aid the comprehensive planning of efficient zoning 
designs. On her recommendation, the school board has hired a team of operations 
research consultants to conduct the study and make recommendations. 

The consultants begin by defining and gathering the relevant data. For this 
purpose they divide the city geographically into 10 tracts. Since the current junior 
high population represents the anticipated high school population in 3 years, they then 
determine the number of white students and black students now in junior high from 
each tract. The distance the students must travel to school is a fundamental consid- 
eration, so they also determine the distance (in miles) from the center of each tract 
to each school. All this information appears in Table 8.7, along with the maximum 
number of students that can be assigned to each school. 

The consultants next begin formulating a mathematical model for the problem. 
In this case (as for many practical problems), the objective is not too well defined. 


¹ It is possible to reformulate this model further to reduce the number of variables, but the number of 
functional constraints remains the same, so the resulting reduction in computational effort turns out to be 
minor. 

² Although this case study is a hypothetical one, it is similar to several actual studies. The theory is based 
primarily on a paper by L. B. Hickman and H. M. Taylor, "School Rezoning to Achieve Racial Balance: 
A Linear Programming Approach," J. Socio-Econ. Planning Sci., 3:127-134, 1969-1970. 


Table 8.7 Data for Middletown Study 

         No. of    No. of              Distance
Tract    Whites    Blacks    School 1    School 2    School 3
1        300       150       1.2         1.5         3.3
2        400       0         2.6         4.0         5.5
3        200       300       0.7         1.1         2.8
4        0         500       1.8         1.3         2.0
5        200       200       1.5         0.4         2.3
6        100       350       2.0         0.6         1.7
7        250       200       1.2         1.4         3.1
8        300       200       3.5         2.3         1.2
9        150       250       3.2         1.2         0.7
10       350       100       3.8         1.8         1.0
School capacity:             1,500       2,000       1,300


Instead, only two basic considerations have been articulated (racial balance and dis- 
tance traveled to school), and the goal is to achieve a reasonable trade-off between 
them. A common approach in this kind of situation is to express one consideration in 
the objective function and the other in the constraints. Thus there is a choice between 
optimizing the racial balance subject to constraints on distance traveled or optimizing 
the distance traveled subject to constraints on racial balance. Because it is easier to 
express distance traveled in the objective function, and because it seems more rea- 
sonable to (eventually) set minimal standards on racial balance for the constraints, the 
consultants choose the latter alternative. 

However, the objective of ‘‘optimizing the distance traveled’’ needs to be stated 
more precisely. One possibility is to minimize the maximum distance that any student 
must travel, using the corresponding formulation technique presented in Sec. 8.3, but 
this objective might lead to many students having to travel the maximum distance. 
Another more convenient objective that may yield a better overall result is to minimize 
the sum of the distances traveled by all students. If this leads to a few unacceptable 
inequities, they can be eliminated during the sensitivity analysis phase by introducing 
constraints on distance traveled by groups of students having excessive distances in 
the original optimal solution. Therefore, the structure chosen for the model is to 
minimize total distance traveled subject to constraints on racial balance and any other 
required constraints. 

Ultimately, decisions must be made about which individual students to assign 
to the respective schools. However, these detailed decisions on how to draw the 
boundaries of the school attendance zones can be worked out after the broader deci- 
sions on how many students from each tract to assign to each school are made. 
Therefore, the decision variables chosen for the model are 


    $x_{ij}$ = number of students in tract i assigned 
             to school j (i = 1, 2, ..., 10; j = 1, 2, 3). 


Rather than breaking these variables down further into the number of white students 
and the number of black students to be assigned, the consultants made a simplifying 
assumption that the racial mixture in each tract will be maintained in the assignments 
to the respective schools. 
The resulting formulation of the model using Table 8.7 is as follows: 

    Minimize    $Z = 1.2x_{11} + 1.5x_{12} + \cdots + 1.0x_{10,3}$,

subject to the following constraints: 

1. Tract assignment: 

    $x_{11} + x_{12} + x_{13} = 450$
    $x_{21} + x_{22} + x_{23} = 400$
    ...
    $x_{10,1} + x_{10,2} + x_{10,3} = 450$.

2. School capacity: 

    $x_{11} + x_{21} + \cdots + x_{10,1} \le 1,500$
    $x_{12} + x_{22} + \cdots + x_{10,2} \le 2,000$
    $x_{13} + x_{23} + \cdots + x_{10,3} \le 1,300$.

3. Nonnegativity: 

    $x_{ij} \ge 0$,    for i = 1, 2, ..., 10; j = 1, 2, 3. 

and 

4. Racial balance: 
    Still to be developed. 


The racial balance constraints need to specify that the fraction of students of a 
given race in a given school must fall within certain limits. After discussing the issue 
with the school board and noting that the entire student population is equally divided 
between whites and blacks, it is decided that the same limits should apply to all the 
schools and that these limits should be symmetric with respect to the races. Thus, for 
each school and either race, the fraction of students should fall within the limits 

    $\frac{1}{2} - \theta \le \text{fraction} \le \frac{1}{2} + \theta$,

so that θ represents the maximum allowable deviation from an equal distribution of 
races in a school. 

However, the school board members do not wish to specify a value for θ until 
they can see the consequences of their decision in terms of the distances the students 
must travel. (Remember that they want to achieve a reasonable trade-off between 
these two considerations.) Therefore, the consultants conclude that they should use 
parametric programming (see Secs. 4.7, 6.7, and 9.3) to determine how the optimal 
solution changes over the entire range of possible values of θ ($0 \le \theta \le \frac{1}{2}$). 

To express the racial balance constraints mathematically, we must first express 
the fraction of students of each race in each school in terms of the decision variables. 
For example, 

    Fraction of white students in school 1 = $\dfrac{\frac{300}{450}x_{11} + \frac{400}{400}x_{21} + \cdots + \frac{350}{450}x_{10,1}}{x_{11} + x_{21} + \cdots + x_{10,1}}$,

where each coefficient in the numerator is simply the number of white students in that 
tract divided by the total number of students in that tract (see Table 8.7). Thus the 
lower limit constraint on this fraction is 

    $L \le \dfrac{\frac{2}{3}x_{11} + x_{21} + \cdots + \frac{7}{9}x_{10,1}}{x_{11} + x_{21} + \cdots + x_{10,1}}$,

where $L = \frac{1}{2} - \theta$. 


Because constraints in this form require the use of less efficient nonlinear pro- 
gramming algorithms, the consultants next convert these constraints into an equivalent 
form that fits the linear programming format. This conversion is done by multiplying 
both sides by the denominator of the right-hand side to obtain 

    $L(x_{11} + x_{21} + \cdots + x_{10,1}) \le \frac{2}{3}x_{11} + x_{21} + \cdots + \frac{7}{9}x_{10,1}$,

and then subtracting this right-hand side from both sides to obtain 

    $(L - \frac{2}{3})x_{11} + (L - 1)x_{21} + \cdots + (L - \frac{7}{9})x_{10,1} \le 0$.


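This same approach is used to develop the lower limit constraints for all six 
fractions (one for each combination of race and school), which is summarized as 
follows: 

4. Racial balance: 

    $(L - \frac{2}{3})x_{11} + (L - 1)x_{21} + \cdots + (L - \frac{7}{9})x_{10,1} \le 0$
    $(L - \frac{1}{3})x_{11} + (L - 0)x_{21} + \cdots + (L - \frac{2}{9})x_{10,1} \le 0$
    $(L - \frac{2}{3})x_{12} + (L - 1)x_{22} + \cdots + (L - \frac{7}{9})x_{10,2} \le 0$
    $(L - \frac{1}{3})x_{12} + (L - 0)x_{22} + \cdots + (L - \frac{2}{9})x_{10,2} \le 0$
    $(L - \frac{2}{3})x_{13} + (L - 1)x_{23} + \cdots + (L - \frac{7}{9})x_{10,3} \le 0$
    $(L - \frac{1}{3})x_{13} + (L - 0)x_{23} + \cdots + (L - \frac{2}{9})x_{10,3} \le 0$.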


This approach also could be used to develop the corresponding upper limit 
constraints representing the requirement that each fraction $\le \frac{1}{2} + \theta$. However, because 

    Fraction of white students = 1 − fraction of black students, 

the preceding lower limit constraints on both types of fractions guarantee that the 
upper limit requirements are satisfied also. Therefore, no additional constraints are 
needed for the model. 
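
The implication can be spelled out in one line. Writing $f_W$ and $f_B$ for the fractions of white and black students in a given school (symbols introduced here only for this sketch), and using $L = \frac{1}{2} - \theta$ together with the lower limits $f_W \ge L$ and $f_B \ge L$:

```latex
f_W = 1 - f_B \le 1 - L = 1 - \left(\tfrac{1}{2} - \theta\right) = \tfrac{1}{2} + \theta,
\qquad
f_B = 1 - f_W \le 1 - L = \tfrac{1}{2} + \theta .
```

So each upper limit is a consequence of the lower limit on the other race.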

One flaw in this formulation is that the $x_{ij}$ (as well as the corresponding numbers 
of white students and black students from tract i assigned to school j) are allowed to 
take on noninteger values (the divisibility assumption of linear programming discussed 
in Sec. 3.3). However, considering the large numbers of students involved, the con- 
sultants feel that there will be no difficulty in adjusting a noninteger optimal solution 
to integer values during the subsequent analysis. They know from experience that a 
linear programming formulation has major computational advantages over an integer 
programming formulation, so this approximation seems well worthwhile. 

The stage now is set to begin the computational phase of the study. When L is 
sufficiently small (that is, θ is sufficiently close to $\frac{1}{2}$), the racial balance constraints 
have no effect and can be deleted. The consultants also note that the problem without 
these constraints can be formulated as a transportation problem (the special type of 
linear programming problem described in Sec. 7.1), as shown in Table 8.8. Therefore, 
rather than using the simplex method, they begin by applying the much more efficient 
transportation simplex method (see Sec. 7.2) to this formulation. The resulting optimal 
solution has basic variables $x_{11} = 450$, $x_{21} = 400$, $x_{31} = 500$, $x_{42} = 500$, $x_{52} = 
400$, $x_{62} = 450$, $x_{71} = 150$, $x_{72} = 300$, $x_{83} = 500$, $x_{92} = 50$, $x_{93} = 350$, $x_{10,3} = 
450$, $x_{11,2} = 300$, with Z = 4,965. (Notice that this solution already is an integer 
solution, which always occurs with transportation problems that have integer supplies 
and demands.) 


Table 8.8 Cost and Requirements Table for Transportation Problem Formulation 
of Middletown Problem Without Racial Balance Constraints 

                        Distance Per Student
                            Destination
Source          School 1    School 2    School 3    Supply
Tract 1         1.2         1.5         3.3         450
Tract 2         2.6         4.0         5.5         400
Tract 3         0.7         1.1         2.8         500
Tract 4         1.8         1.3         2.0         500
Tract 5         1.5         0.4         2.3         400
Tract 6         2.0         0.6         1.7         450
Tract 7         1.2         1.4         3.1         450
Tract 8         3.5         2.3         1.2         500
Tract 9         3.2         1.2         0.7         400
Tract 10        3.8         1.8         1.0         450
Dummy 11(D)     0           0           0           300
Demand          1,500       2,000       1,300

The next step is to determine when this solution also is optimal for the original 
model with the racial balance constraints included. This determination is made by 
checking how large L can be made before the solution violates any of the racial balance 
constraints, which turns out to be L = 0.285. Because the solution is feasible for this 
range of values of L, it must also be optimal for these values. 
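
Both figures quoted for this solution, Z = 4,965 and the threshold L = 0.285, can be checked directly from Table 8.7. The Python sketch below recomputes them under the model's assumption that each tract's racial mix carries over to every school it is assigned to. The function names and the dictionary layout are illustrative assumptions; the dummy source is omitted because it contributes no students and no distance.

```python
WHITES = [300, 400, 200, 0, 200, 100, 250, 300, 150, 350]
BLACKS = [150, 0, 300, 500, 200, 350, 200, 200, 250, 100]
DIST = [  # miles from tract i to schools 1, 2, 3 (Table 8.7)
    (1.2, 1.5, 3.3), (2.6, 4.0, 5.5), (0.7, 1.1, 2.8), (1.8, 1.3, 2.0),
    (1.5, 0.4, 2.3), (2.0, 0.6, 1.7), (1.2, 1.4, 3.1), (3.5, 2.3, 1.2),
    (3.2, 1.2, 0.7), (3.8, 1.8, 1.0),
]
# (tract, school) -> number of students, using 1-based labels as in the text.
PLAN = {
    (1, 1): 450, (2, 1): 400, (3, 1): 500, (4, 2): 500, (5, 2): 400,
    (6, 2): 450, (7, 1): 150, (7, 2): 300, (8, 3): 500, (9, 2): 50,
    (9, 3): 350, (10, 3): 450,
}


def total_distance(plan):
    """Sum of the distances traveled by all students (the objective Z)."""
    return sum(n * DIST[i - 1][j - 1] for (i, j), n in plan.items())


def smallest_fraction(plan):
    """Smallest share of either race found in any of the three schools."""
    shares = []
    for school in (1, 2, 3):
        white = black = 0.0
        for (i, j), n in plan.items():
            if j == school:
                tract_size = WHITES[i - 1] + BLACKS[i - 1]
                white += n * WHITES[i - 1] / tract_size
                black += n * BLACKS[i - 1] / tract_size
        shares += [white / (white + black), black / (white + black)]
    return min(shares)


print(round(total_distance(PLAN), 3), round(smallest_fraction(PLAN), 4))
```

The smallest share is the white fraction in school 2, about 0.2855, so every lower limit constraint holds as long as L does not exceed that value.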

Given this information, the consultants next use parametric programming to 
determine how the optimal solution changes as L is increased continuously to $\frac{1}{2}$, 
beginning with the preceding solution at L = 0.285. The result is a continually 
changing optimal solution where each variable is expressed as a function of L. (This 
approach can be thought of as applying the sensitivity analysis procedure described 
in Sec. 6.6 on a continuing basis to determine the effect of introducing the racial 
balance constraints as needed and of changing the coefficients of the variables in these 
constraints.) 

However, the consultants feel that the results in this form would be too complex 
to be considered effectively by the school board. Therefore, after a careful examination 
of the results, they select a relatively small number of interesting alternatives (L = 
0.285, 0.30, 0.35, 0.40) that represent a cross section of trade-offs between racial 
balance and distance traveled. These alternatives are analyzed in detail, and appro- 
priate refinements are made in the "optimal solution" obtained from the model. The 
consultants then present their basic data and conclusions for the four alternatives to 
the school board, as summarized in Table 8.9 (with θ = 0.5 − L). 


Table 8.9 Summary of Results for Middletown Problem 

          Average Distance                     Percentage
θ         Traveled (Miles)    1.0-1.4 Miles    1.5-1.9 Miles    ≥ 2 Miles
0.215     1.103               53.3                              8.9
0.20      1.107               51.1                              8.9
0.15      1.125               41.1                              8.9
0.10      1.152               36.1


After considerable deliberation, the school board members choose the θ = 0.15 
alternative. However, they modify this alternative slightly to avoid reassigning a very 
small proportion of one tract to a new school. The resulting master plan allocates 
tracts 2, 3, and 7 to school 1, tracts 1, 4, 5, and 6 to school 2, and tracts 8 and 9 to 
school 3, with tract 10 split as follows: $x_{10,2} = 50$, $x_{10,3} = 400$. Because this plan 
yields θ = 0.155, the school board officially announces the new policy: that either 
race should form at least one-third the student body of any high school. They then 
instruct the superintendent to have her staff implement this policy, using the master 
plan as a basis for detailed planning. 
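
The value θ = 0.155 for the master plan can be recomputed from the tract data in Table 8.7. The Python sketch below does so, again assuming that the racial mix of tract 10 carries over proportionally to both of the schools it is split between. The names `MASTER_PLAN` and `max_deviation` are hypothetical and used only for this illustration.

```python
# Whites and blacks per tract (Table 8.7), keyed by tract number.
WHITES = {1: 300, 2: 400, 3: 200, 4: 0, 5: 200,
          6: 100, 7: 250, 8: 300, 9: 150, 10: 350}
BLACKS = {1: 150, 2: 0, 3: 300, 4: 500, 5: 200,
          6: 350, 7: 200, 8: 200, 9: 250, 10: 100}

# school -> {tract: students assigned}; tract 10 is split 50 / 400.
MASTER_PLAN = {
    1: {2: 400, 3: 500, 7: 450},
    2: {1: 450, 4: 500, 5: 400, 6: 450, 10: 50},
    3: {8: 500, 9: 400, 10: 400},
}


def max_deviation(plan):
    """Largest deviation of any school's white fraction from one-half.

    The black fraction is one minus the white fraction, so its deviation
    from one-half is the same and need not be checked separately.
    """
    worst = 0.0
    for tracts in plan.values():
        white = sum(n * WHITES[t] / (WHITES[t] + BLACKS[t])
                    for t, n in tracts.items())
        total = sum(tracts.values())
        worst = max(worst, abs(white / total - 0.5))
    return worst


theta = max_deviation(MASTER_PLAN)
print(round(theta, 3), round(0.5 - theta, 3))
```

The largest deviation occurs in school 2, where whites make up roughly 34.5 percent of the students. That is just above one-third, which is consistent with the policy the board announces.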


8.6 Conclusions 
This chapter has described and illustrated some particularly useful formulation tech- 
niques for building linear programming models. This material provides a good back- 
ground for you, but the best teacher in this area is experience! Our goal has been to 
provide you with a solid foundation for dealing with real problems and for continuing 
to learn the art of linear programming. 
PROBLEMS 
1. Consider the following problem. 
Minimize $Z = |x_1| + 2|x_2|$,
subject to xX, + B= 4 
-x + y= 6 
=y — 3x, = -22 
(no nonnegativity constraints). 
(a) Use the technique presented in Sec. 8.1 to formulate the linear programming model 
for this problem. 
(b) Use the simplex method to solve the model as formulated in part (a). 


(c) Solve the original problem graphically by considering each of the four quadrants 
separately. 


2. Consider the following problem. 
Minimize Z = fi) + fo), 
subject to 2x, + 3x, 29 
x, + 2x,=9 
—X% = 1 
(no nonnegativity constraints), where 


fi@,) = { 


3x, if x,=0 
xy if x, =0, 


_ | 4x if x20 
fia) = be if x, <0. 


(a) Use the technique presented in Sec. 8.1 to formulate the linear programming model 
for this problem. 

(b) Use the simplex method to solve the problem as formulated in part (a). 

(c) Solve the original problem graphically by considering each of the four quadrants 
separately. 


3. Consider the following problem. 
Minimize Z = |x, — 2x, + xl, 
subject to 2x, + 3x, + 4x; = 40 
Tx, + 5x, + 3x3 = 70 
and x, 20, x, = 0, x32 0. 


Use the technique presented in Sec. 8.1 to formulate the linear programming model for this 
problem. 


4. Consider the following problem. 


Minimize Z = f,3x, — 2x.) + fx. — 4x3), 
subject to 6x, + 7x, + 5x3 = 30 
8x, + 5x, + 9x3 = 40 
and x 20, x, = 0, x3 = 0, 
7 _ f 3@x,-2x,) if 3x, -2x,20 
where fiBxı = 2x2) = eee —2x,) if 3x, — 2x,<0, 


ava) 4Bn- 4) if 
POr = 4x) e —4x,) if 


Use the technique presented in Sec. 8.1 to formulate the linear programming model for this 
problem. 


3x, — 4x, = 0 
3x, — 4x, = 0. 


5. The Research and Development Division of a certain company has developed three 
new products. The problem is to decide which mix of these products should be produced. 
Management wants primary consideration given to three factors: long-run profit, stability in the 
work force, and achieving an increase in the company’s earnings next year. In particular, using 
the units given in the following table, they want to 

Maximize Z = 2P − 5C − 3D,
where P = total (discounted) profit over the life of the new products, 
C = change (in either direction) in the current level of employment, 

D = decrease (if any) in next year’s earnings from the current year’s level. 

The amount of any increase in earnings does not enter into Z, because management is concerned 
primarily with just achieving some increase to keep the stockholders happy. (It has mixed 
feelings about a large increase that then would be difficult to surpass in subsequent years.) 

The impact of each of the new products (per unit rate of production) on each of these 
factors is shown in the following table: 


Unit Contribution 


Product 








Factor Goal (Units) 


None (millions of dollars) 
=50 (hundreds of employees) 
275 (millions of dollars) 





Long-run profit 
Employment level 
Earnings next year 






Except for certain additional constraints not described here, use the goal programming 
technique to formulate the linear programming model for this problem. 


6. Reconsider the Middletown case study presented in Sec. 8.5. Suppose that the ob- 
jective is changed to minimize racial imbalance subject to a constraint on distance traveled and 
other necessary constraints. Racial imbalance is defined as the sum (over the three high schools) 
of the absolute difference between the number of white students and the number of black students 
at each high school. The constraint on distance traveled is that the average distance traveled 
by students to school must not exceed 1.15 miles. 

Describe how this problem fits into the framework of nonpreemptive goal programming 
by identifying the goals involved, and then use the goal programming technique to formulate 
the new linear programming model. 
7.* Consider a preemptive goal programming problem with three priority levels, just
one goal for each priority level, and just two activities to contribute toward these goals, as 
summarized in the following table: 


Unit Contribution 





Activity 
Priority Level 1 2 | Goal 
First priority 2 | #20 
Second priority 1 | =15 
Third priority 1 | 240 





(a) Use the goal programming technique to formulate one complete linear programming 
model for this problem. 

(b) Construct the initial simplex tableau for applying the streamlined procedure. Identify
the initial basic feasible solution and the initial entering basic variable, but do not 
proceed further. 

(c) Starting from (b), use the streamlined procedure to solve the problem. 

(d) Use the logic of preemptive goal programming to solve the problem graphically by 
focusing on just the two decision variables. Explain the logic used. 

(e) Use the sequential procedure to solve this problem. After using the goal programming
technique to formulate the linear programming model (including auxiliary
variables) at each stage, solve the model graphically by focusing on just the two
decision variables. Identify all optimal solutions obtained for each stage.


8. Redo Prob. 7 with the following revised table: 


Unit Contribution 






Priority Level 






First priority 20 
Second priority 230 
Third priority 


9. A certain developing country has 15,000,000 acres of publicly controlled agricultural 
land in active use. Its government currently is planning a way to divide this land among three 
basic crops (labeled 1, 2, and 3) next year. A certain percentage of each of these crops is 
exported in order to obtain badly needed foreign capital (dollars), and the rest of each of these 
crops is used to feed the populace. Raising these crops also provides employment for a signifi- 
cant proportion of the population. Therefore, the main factors to be considered in allocating 
the land to these crops are (1) the amount of foreign capital generated, (2) the number of citizens 
fed, and (3) the number of citizens employed in raising these crops. The following table shows 
how much each 1,000 acres of each crop contributes toward these factors, and the last column 
gives the goal established by the government for each of these factors. 


                      Contribution Per 1,000 Acres
                      Crop
Factor                1        2        3        Goal
Foreign capital       $3,000   $5,000   $4,000   ≥ $70,000,000
Citizens fed          150      75       100      ≥ 1,750,000
Citizens employed     10       15       12       = 200,000





(a) In evaluating the relative seriousness of not achieving these goals, the government 
has concluded that the following deviations from the goals should be considered 
equally undesirable: (1) each $100 under the foreign capital goal, (2) each person 
under the citizens fed goal, and (3) each deviation of one (in either direction) from 
the citizens employed goal. Use the goal programming technique to formulate the 
linear programming model for this problem. 

(b) Now suppose that the government concludes that the importance of the various goals
differs greatly so that a preemptive goal programming approach should be used. In
particular, the first-priority goal is citizens fed ≥ 1,750,000, the second-priority goal
is foreign capital ≥ $70,000,000, and the third-priority goal is citizens employed =
200,000. Use the goal programming technique to formulate one complete linear
programming model for this problem.

(c) Use the streamlined procedure to solve the problem as formulated in part (b). 

(d) Use the sequential procedure to solve the problem as presented in part (b). 


10. Reconsider Prob. 9. Suppose now that the third-priority goal actually is a lower
one-sided goal (desire ≥ 200,000 citizens employed). The government feels that it is critical
to come at least close to satisfying all of these goals. Therefore, the decision has been made 
to reformulate the problem to adopt the overriding objective of maximizing the minimum 
progress (on a percentage basis) toward all these goals. 

Use the technique presented in Sec. 8.3 to formulate the linear programming model for 
this new problem. 


11. One of the most important problems in the field of statistics is the linear regression
problem. Roughly speaking, this problem involves fitting a straight line to statistical data
represented by points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ on a graph. If we denote the line by
y = a + bx, the objective is to choose the constants a and b to provide the ‘‘best’’ fit according
to some criterion. The criterion usually used is the method of least squares, but there are other
interesting criteria where linear programming can be used to solve for the optimal values of a
and b.

For each of the following criteria, formulate the linear programming model for this 
problem. 

(a) Minimize the sum of the absolute deviations of the data from the line; that is,

Minimize $\sum_{i=1}^{n} |y_i - (a + bx_i)|$.

(Hint: Note that this problem can be viewed as a nonpreemptive goal programming
problem where each data point represents a ‘‘goal’’ for the regression line.)

(b) Minimize the maximum absolute deviation of the data from the line; that is,

Minimize $\max_{i=1,2,\ldots,n} |y_i - (a + bx_i)|$.


12. Reconsider the Middletown case study described in Sec. 8.5. Suppose that the 
objective is changed from minimizing the sum of the distances traveled by all students to 
minimizing the maximum over the tracts of the total distance traveled by the students in each 
respective tract, subject to the same constraints (including racial balance constraints) as before. 
Formulate the new linear programming model for this problem. 


13. Reconsider the Middletown case study described in Sec. 8.5. Suppose that bussing 
must be provided for all students traveling more than 1.5 miles (but to no others), and that the 
school board has adopted the objective of minimizing the total cost of this bussing, subject to 
the same constraints (including racial balance constraints) as before. Assuming that this cost is 
proportional to the sum of the distance traveled by all bussed students, formulate the new 
objective function for this problem. 



14. The school board of a certain city has made the decision to close one of its middle
schools (sixth, seventh, and eighth grades) at the end of this school year and reassign all of 
next year’s middle school students to the three remaining middle schools. The school district 
provides bussing for all middle school students who must travel more than approximately a 
mile, so the school board wants a plan for reassigning the students that will minimize the total 
bussing cost. The cost per student of bussing from each of the six residential areas of the city 
to each of the schools is shown in the following table (along with other basic data for next 
year), where 0 indicates that bussing is not needed and a dash indicates an infeasible assignment. 





























       No. of     % 6th   % 7th   % 8th   Bussing Cost Per Student
Area   Students   Grade   Grade   Grade   School 1   School 2   School 3
1      450        32      38      30      3          0          7
2      600        37      28      35      0          4          5
3      550        31      32      37      6          3          2
4      350        28      39      33      2          5          —
5      500        39      34      27      0          —          4
6      450        33      29      38      5          3          0
School capacity:                          900        1,100      1,000


The school board also has imposed the restriction that each grade must constitute between 
30 and 35 percent of each school’s population. The above table shows the percentage of each 
area’s middle school population for next year that falls into each of the three grades. The school 
attendance zone boundaries can be drawn so as to split any given area among more than one
school, but assume that the percentages shown in the table will continue to hold for any partial 
assignment of an area to a school. 

Formulate a linear programming model for determining how many students should be 
assigned from each area to each school. 


15. A company desires to blend a new alloy of 40 percent tin, 35 percent zinc, and 
25 percent lead from several available alloys having the following properties: 














Percentage tin 
Percentage zinc 
Percentage lead 


Cost ($/lb)


10 15 45 SO 40 
30 60 10 30 10 


The objective is to determine the proportions of these alloys that should be blended to produce
the new alloy at a minimum cost. Formulate the linear programming model for this problem. 


16. At the beginning of the fall semester, the director of the computer facility of a 
certain university is confronted with the problem of assigning different working hours to her 
operators. Because all the operators are currently enrolled in the university, her main concern 
is to make certain that the operators’ working times are not so excessive that they would interfere 
with study times. 

There are six operators (four men and two women). They all have different wage rates 
because of differences in their experience with computers and in their programming ability. 
The following table shows their wage rates, along with the maximum number of hours that 
each can work each day. 


                          Maximum Hours of Availability
Operator   Wage Rate      Mon.   Tue.   Wed.   Thurs.   Fri.
K. C.      $6.00/hour     6      0      6      0        6
D. H.      $6.10/hour     0      6      0      6        0
H. B.      $5.90/hour     4      8      4      0        4
S. C.      $5.80/hour     5      5      5      0        5
K. S.      $6.80/hour     3      0      3      8        0
N. K.      $7.30/hour     0      0      0      6        2


Because of a tight budget, the director has to minimize cost. Her decision is that the 
operators with the highest wage rates should work the least possible number of hours, except 
that this number should not be so low as to impair his or her knowledge of the operation. This 
level is set arbitrarily at 8 hours per week for the male operators and 7 hours per week for the 
female operators (K. S., N. K.). 

The computer facility is to be open for operation from 8 A.M. to 10 p.m. Monday through 
Friday with exactly one operator on duty during these hours. On Saturdays and Sundays, the 
computer is to be operated by other staff. 

Formulate a linear programming model so that the director can determine the number of 
hours she should assign to each operator on each day. 


17. A lumber company has three sources of wood and five markets to be supplied. The 
annual availability of wood at sources 1, 2, and 3 is 10, 20, and 15 million board feet, 
respectively. The amount that can be sold annually at markets 1, 2, 3, 4, and 5 is 7, 12, 9, 
10, and 8 million board feet, respectively. 

In the past the company has shipped the wood by train. However, because shipping costs 
have been increasing, the alternative of using ships to make some of the deliveries is being 
investigated. This alternative would require the company to invest in some ships. Except for 
these investment costs, the shipping costs in thousands of dollars per million board feet by rail 
and by water (when feasible) would be the following for each route: 


Unit Cost by Rail Unit Cost by Ship 






Market 
3 4 
61 72 45 55 


69 78 60 49 56 
59 66 63 6l 


Market 
3 4 


38 24 — 35 
36 4&3 2 24 3l 
33 36 32 26 


































The capital investment (in thousands of dollars) in ships required for each million board feet 
to be transported annually by ship along each route is given as follows: 


Investment for Ships 





Market 
Source 1 2 3 4 5 
1 275 303 238 — 285 
2 293 318 270 250 265 
3 — 283 275 268 240 





Considering the expected useful life of the ships and the time value of money, the equivalent 
uniform annual cost of these investments is one-tenth the amount given in the table. The 
company is able to raise only $6,750,000 to invest in ships. The objective is to determine the
overall shipping plan that minimizes the total equivalent uniform annual cost while meeting 
this investment budget and the sales demand at the markets. Formulate the linear programming 
model for this problem. 


18. A company needs to lease warehouse storage space over the next 5 months. Just 
how much space will be required in each of these months is known. However, since these 
space requirements are quite different, it may be most economical to lease only the amount 
needed each month on a month-by-month basis. On the other hand, the additional cost for 
leasing space for additional months is much less than for the first month, so it may be less 
expensive to lease the maximum amount needed for the entire 5 months. Another option is the 
intermediate approach of changing the total amount of space leased (by adding a new lease 
and/or having an old lease expire) at least once but not every month. 

The space requirement (in thousands of square feet) and the leasing costs (in hundreds 
of dollars) for the various leasing periods are as follows: 


Required   Leasing Period   Leasing Cost Per 1,000
Space      (Months)         Sq. Ft. Leased
           1                650
           2                1,000
           3                1,350
           4                1,600
           5                1,900





The objective is to minimize the total leasing cost for meeting the space requirements. Formulate 
the linear programming model for this problem. 


19. A spaceship to take astronauts to Mars and back is being designed. This spaceship
will have three compartments, each with its own independent life support system. The key
element in each of these life support systems is a small oxidizer unit that triggers a chemical
process for producing oxygen. However, these units cannot be tested in advance, and only
some of them succeed in triggering this chemical process. Therefore, it is important to have 
several backup units for each system. Because the requirements are different for the three 
compartments, the units needed for each one have somewhat different characteristics. A decision 
must now be made on the number of units to be provided for each compartment, taking into 
account design limitations on the total amount of space, weight, and cost that can be allocated 
to these units for the entire spaceship. The following table summarizes these limitations, as 
well as the characteristics of the individual units for each compartment: 


              Space       Weight   Cost     Probability
Compartment   (Cu. In.)   (Lb.)    ($)      of Failure
              40
                          15       40,000
                                   45,000
                                   35,000
                                            0.40
                                            0.20

If all the units fail in just one or two of the compartments, the astronauts can occupy
the remaining compartment(s) and continue their space voyage but with some loss in the amount
of scientific information they can obtain. However, if all units fail in all three compartments,
the astronauts can still return the spaceship safely, but the whole voyage must be completely
aborted at great expense. Therefore, the objective is to minimize the probability of all units
failing, subject to the preceding limitations and the further restriction that each compartment
has a probability of no more than 0.05 that all its units fail. Formulate the linear programming
model for this problem. (Hint: Use logarithms.)


20. A large paper manufacturing company has 10 paper mills and a large number (say, 
1,000) of customers to be supplied. It uses three alternative types of machines and four types 
of raw materials to make five different types of paper. Therefore, the company needs to develop 
a detailed production-distribution plan on a monthly basis, with an objective of minimizing the 
total cost of producing and distributing the paper during the month. Specifically, it is necessary 
to determine jointly the amount of each type of paper to be made at each paper mill on each 
type of machine and the amount of each type of paper to be shipped from each paper mill to 
each customer. 

The relevant data can be expressed symbolically as 


$D_{jk}$ = number of units of paper type k demanded by customer j,
$r_{klm}$ = number of units of raw material m needed to produce one unit of paper type k on
machine type l,
$R_{im}$ = number of units of raw material m available at paper mill i,
$c_{kl}$ = number of capacity units of machine type l that will produce one unit of paper type k,
$C_{il}$ = number of capacity units of machine type l available at paper mill i,
$P_{ikl}$ = production cost for each unit of paper type k produced on machine type l at paper
mill i,
$T_{ijk}$ = transportation cost for each unit of paper type k shipped from paper mill i to
customer j.


(a) Using these symbols, formulate the linear programming model for this problem. 
(b) Considering the special structure of this model, give your recommendation on how 
it should be solved. 


21. One measure of the quality of the water in a river is its dissolved oxygen (D.O.) 
concentration. This measure is of interest partly because certain minimum concentration levels 
of D.O. are necessary to permit fish and other aquatic animals to survive. A large portion of 
the waste released into streams is organic material. This material is a source of nutrients for 
many organisms found in streams. In the process of utilizing the organic material, the organisms 
withdraw the D.O. contained in the stream. Thus the larger the amount of these wastes, the 
larger the biochemical oxygen demand (B.O.D.). 

Consider the following river system consisting of two tributaries leading into the main
stream. The daily flow rate of water at cities 1 to 4 is known to be $f_1$, $f_2$, $f_1 + f_2$, $f_1 + f_2$,
respectively. Water at city 1 or city 2 requires 1 day to reach city 3, and water at city 3 requires
1 day to reach city 4. Let $D_i$ be the known D.O. concentration of the water just above city i
(i = 1, 2). Similarly, let $B_i$ be the known waste concentration of the water, measured by its
B.O.D. concentration, just above city i (i = 1, 2). Wastes are discharged from cities 1 to 3
into the stream in known amounts $w_1$, $w_2$, $w_3$ per day. (These are negligible in comparison
with $f_1$, $f_2$.)

[Figure: two tributaries with flow rates $f_1$ and $f_2$ joining a main stream with flow rate $f_1 + f_2$.]

If the waste discharged from city i (i = 1, 2, 3) is untreated, its B.O.D. concentration
would be $U_i$. However, if appropriate treatment processes are used, the B.O.D. concentration
can be lowered to any level between $L_i$ and $U_i$. The cost of reducing the B.O.D. concentration
from $U_i$ is $c_i$ per unit reduction. However, some treatment is necessary at some or all
of these cities to achieve at least a minimum standard S for the D.O. concentration at cities 3
and 4. The problem is to choose the B.O.D. concentration of wastes discharged from cities 1
to 3 that minimizes the cost of meeting this standard.

The following biochemical model has been developed to help solve problems of this
type. Suppose that river water has a daily flow rate of f and a B.O.D. concentration of b and
then has waste discharged at a rate of w per day with a B.O.D. concentration of x. The effect
immediately downstream is to raise the B.O.D. concentration of the water to

$$\frac{bf + xw}{f + w} \approx b + \left(\frac{w}{f}\right)x.$$

There is no immediate effect on the D.O. concentration. However, both the D.O. concentration
and the B.O.D. concentration would change gradually downstream. In particular, if $u_0$ and $v_0$
are, respectively, the current D.O. and B.O.D. concentrations of the water at a particular
location on a river, and if no waste is added to this water, then the respective concentrations
$u_1$, $v_1$ of D.O. and B.O.D. in this same water 1 day downstream become

$$u_1 = \alpha + \beta u_0 - \gamma v_0,$$

$$v_1 = \delta + \varepsilon v_0,$$

where $\alpha$, $\beta$, $\gamma$, $\delta$, $\varepsilon$ are positive constants reflecting the various underlying physical and
biochemical processes. Formulate the linear programming model for this problem.
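
The two relationships in this biochemical model are simple enough to evaluate directly. The following Python sketch is only an illustration: the function names and all numerical values are hypothetical and are not taken from the problem. It compares the exact mixing ratio with the approximation that is linear in x, and then applies the one-day downstream equations.

```python
def mix_exact(b, f, w, x):
    """B.O.D. concentration just below a discharge: (bf + xw) / (f + w)."""
    return (b * f + x * w) / (f + w)


def mix_approx(b, f, w, x):
    """Approximation b + (w/f)x, reasonable when w is negligible next to f."""
    return b + (w / f) * x


def one_day_downstream(u0, v0, alpha, beta, gamma, delta, eps):
    """D.O. (u) and B.O.D. (v) concentrations one day downstream."""
    u1 = alpha + beta * u0 - gamma * v0
    v1 = delta + eps * v0
    return u1, v1


# Hypothetical data: flow 1000, upstream B.O.D. 2.0, discharge 1 at B.O.D. 300.
exact = mix_exact(2.0, 1000.0, 1.0, 300.0)
approx = mix_approx(2.0, 1000.0, 1.0, 300.0)

# Hypothetical constants alpha=1, beta=0.9, gamma=0.5, delta=0.2, eps=0.8.
u1, v1 = one_day_downstream(8.0, approx, 1.0, 0.9, 0.5, 0.2, 0.8)
```

Because the approximate form is linear in the discharge concentration x, and the downstream equations are linear in the starting concentrations, quantities built from them stay linear in x.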





Other Algorithms for 
Linear Programming 


The key to the extremely widespread use of linear programming is the availability of 
an exceptionally efficient algorithm—the simplex method—that will routinely solve 
the large-sized problems that typically arise in practice. However, the simplex method 
is only part of the arsenal of algorithms regularly used by linear programming prac- 
titioners. Chapter 7 described special types of linear programming problems for which 
streamlined versions of the simplex method are available (as illustrated by the trans- 
portation simplex method in Sec. 7.2). Section 4.8 mentioned that production com- 
puter codes adapt the simplex method to a more convenient matrix form presented in 
Sec. 5.2. Sections 4.7 and 6.6 pointed out how certain modifications or extensions of 
the simplex method are particularly useful for sensitivity analysis. Thus all these 
algorithms are variants of the simplex method as it was presented in Chap. 4. Con- 
sequently, they also are exceptionally efficient. 

This chapter focuses first on three particularly important algorithms based on 
the simplex method. In particular, the next three sections present the upper bound 
technique (a streamlined version of the simplex method for dealing with variables 
having upper bounds), the dual simplex method (a modification particularly useful for
sensitivity analysis), and parametric programming (an extension for systematic
sensitivity analysis).

Section 4.9 introduced an exciting new advancement in linear programming — 
the development of a powerful new type of algorithm that moves through the interior 
of the feasible region. We describe this interior-point approach further in Sec. 9.4. 


9.1 The Upper Bound Technique


At the end of Sec. 7.5 we discussed the fact that it is common for some or all of the
individual $x_j$ variables to have upper bound constraints

$$x_j \le u_j,$$

where $u_j$ is a positive constant representing the maximum feasible value of $x_j$. (The
right-hand side of Table 7.35 shows the resulting special structure of the functional
constraints.) However, we also pointed out in Sec. 4.8 that the most important determinant
of computation time for the simplex method is the number of functional
constraints, whereas the number of nonnegativity constraints is relatively unimportant.
Therefore, having a large number of upper bound constraints among the functional
constraints greatly increases computational effort.

The upper bound technique avoids this increased effort by removing the upper 
bound constraints from the functional constraints and treating them separately, essen- 
tially like nonnegativity constraints. Removing the upper bound constraints in this 
way causes no problems as long as none of the variables get increased over its upper 
bound. The only time the simplex method increases some of the variables is when 
the entering basic variable is increased to obtain a new basic feasible solution. There- 
fore, the upper bound technique simply applies the simplex method in the usual way 
to the remainder of the problem (i.e., without the upper bound constraints) but with 
the one additional restriction that each new basic feasible solution is required to satisfy 
the upper bound constraints in addition to the usual lower bound (nonnegativity) 
constraints. 

To implement this idea, note that a decision variable $x_j$ with an upper bound
constraint ($x_j \le u_j$) can always be replaced by

$$x_j = u_j - y_j,$$

where $y_j$ would then be the decision variable. In other words, you have a choice
between letting the decision variable be the amount above zero ($x_j$) or the amount
below $u_j$ ($y_j = u_j - x_j$). (We shall refer to $x_j$ and $y_j$ as complementary decision
variables.) Because

$$0 \le x_j \le u_j,$$

it also follows that

$$0 \le y_j \le u_j.$$

Thus at any point during the simplex method you can either

1. Use $x_j$, where $0 \le x_j \le u_j$, or
2. Replace $x_j$ by $(u_j - y_j)$, where $0 \le y_j \le u_j$.

The upper bound technique uses the following rule to make this choice:

Rule: Begin with choice 1.

Whenever $x_j = 0$, use choice 1, so $x_j$ is nonbasic.

Whenever $x_j = u_j$, use choice 2, so $y_j = 0$ is nonbasic.

Switch choices only when the other extreme value of $x_j$ is reached.


Therefore, whenever a basic variable reaches its upper bound, you should switch 
choices and use its complementary decision variable as the new nonbasic variable (the 
leaving basic variable) for identifying the new basic feasible solution. Thus the one 
substantive modification being made in the simplex method is in the rule for selecting 
the leaving basic variable. 

Recall that the simplex method selects as the leaving basic variable the one that
would be the first to become infeasible by going negative as the entering basic variable 
is increased. The modification now made is to select instead the variable that would 
be the first to become infeasible in any way, either by going negative or by going 
over the upper bound, as the entering basic variable is increased. (Notice that one 
possibility is that the entering basic variable may become infeasible first by going 
over its upper bound, so that its complementary decision variable becomes the leaving 
basic variable.) If the leaving basic variable reaches zero, then proceed as usual with 
the simplex method. However, if it reaches its upper bound instead, then switch 
choices and make its complementary decision variable the leaving basic variable. 

To illustrate, consider the problem:

Maximize $Z = 2x_1 + x_2 + 2x_3$,

subject to

$4x_1 + x_2 = 12$
$-2x_1 + x_3 = 4$

and

$0 \le x_1 \le 4$
$0 \le x_2 \le 15$
$0 \le x_3 \le 6$.

Thus all three variables have upper bound constraints ($u_1 = 4$, $u_2 = 15$, $u_3 = 6$).

The two equality constraints are already in proper form from Gaussian elimination
for identifying the initial basic feasible solution ($x_1 = 0$, $x_2 = 12$, $x_3 = 4$),
and none of the variables in this solution exceeds its upper bound, so $x_2$ and $x_3$ can be
used as the initial basic variables without introducing artificial variables. However,
these variables then need to be eliminated algebraically from the objective function
to obtain the initial Eq. (0), as follows:

      $Z - 2x_1 - x_2 - 2x_3 = 0$
      $+\,(4x_1 + x_2 = 12)$
      $+\,2(-2x_1 + x_3 = 4)$
(0)   $Z - 2x_1 = 20$.
Table 9.1 Equations and Calculations for Initial Leaving
Basic Variable in Example for Upper Bound Technique

Initial Set of Equations        Maximum Feasible Value of $x_1$
(0) $Z - 2x_1 = 20$             $x_1 \le 4$ (since $u_1 = 4$)
(1) $4x_1 + x_2 = 12$           $x_1 \le 12/4 = 3$
(2) $-2x_1 + x_3 = 4$           $x_1 \le (6 - 4)/2 = 1$ ← minimum (because $u_3 = 6$)





To start the first iteration, this initial Eq. (0) indicates that the initial entering
basic variable is $x_1$. Since the upper bound constraints are not to be included, the
entire initial set of equations and the corresponding calculations for selecting the
leaving basic variable are those shown in Table 9.1. The second column shows how
much the entering basic variable $x_1$ can be increased from zero before some basic
variable (including $x_1$) becomes infeasible. The maximum value given next to Eq. (0)
is just the upper bound constraint for $x_1$. For Eq. (1), since the coefficient of $x_1$ is
positive, increasing $x_1$ to 3 decreases the basic variable in this equation ($x_2$) from 12
to its lower bound of zero. For Eq. (2), since the coefficient of $x_1$ is negative, increasing
$x_1$ to 1 increases the basic variable in this equation ($x_3$) from 4 to its upper
bound of 6.
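
The ratio calculations of Table 9.1 can be sketched in code. The following Python function is a minimal illustration, not part of the original presentation; the name `leaving_variable` and its argument layout are assumptions made for this sketch. It returns the largest feasible increase in the entering basic variable, the index of the equation that limits it (or `None` if the entering variable's own upper bound limits it), and a flag telling whether the limit comes from an upper bound.

```python
from fractions import Fraction


def leaving_variable(rhs, col, basic_upper, entering_upper):
    """Modified minimum ratio test for the upper bound technique.

    rhs[i]          current value of the basic variable in equation i
    col[i]          coefficient of the entering variable in equation i
    basic_upper[i]  upper bound of that basic variable (None if unbounded)
    entering_upper  upper bound of the entering variable (None if unbounded)
    """
    # Start with the entering variable's own upper bound, if it has one.
    if entering_upper is not None:
        best = (Fraction(entering_upper), None, True)
    else:
        best = (None, None, False)

    for i, (b, a, u) in enumerate(zip(rhs, col, basic_upper)):
        if a > 0:
            # Basic variable decreases and would go negative.
            cand = (Fraction(b) / a, i, False)
        elif a < 0 and u is not None:
            # Basic variable increases and would exceed its upper bound.
            cand = (Fraction(u - b) / -a, i, True)
        else:
            continue  # This equation places no limit on the increase.
        if best[0] is None or cand[0] < best[0]:
            best = cand
    return best


# Data of the example: Eqs. (1) and (2), with u1 = 4, u2 = 15, u3 = 6.
result = leaving_variable([12, 4], [4, -2], [15, 6], 4)
```

With the example's data, `result` is a limit of 1 from the equation at index 1, that is Eq. (2), with the upper bound flag set. This matches the selection of $x_3$ as the variable that reaches its upper bound first.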

Because this last maximum value of $x_1$ is the smallest, $x_3$ provides the leaving
basic variable. However, because $x_3$ reached its upper bound, replace $x_3$ by $(6 - y_3)$
so that $y_3 = 0$ becomes the new nonbasic variable for the next basic feasible solution
and $x_1$ becomes the new basic variable in Eq. (2). This replacement leads to the
following changes in this equation:

(2)   $-2x_1 + x_3 = 4$
  →   $-2x_1 + (6 - y_3) = 4$
  →   $-2x_1 - y_3 = -2$
  →   $x_1 + \tfrac{1}{2} y_3 = 1$.

Therefore, after eliminating $x_1$ algebraically from the other equations, the second
complete set of equations becomes

(0)   $Z + y_3 = 22$
(1)   $x_2 - 2y_3 = 8$
(2)   $x_1 + \tfrac{1}{2} y_3 = 1$.

The resulting basic feasible solution is $x_1 = 1$, $x_2 = 8$, $y_3 = 0$. By the optimality
test, it also is an optimal solution, so $x_1 = 1$, $x_2 = 8$, $x_3 = 6 - y_3 = 6$ is the
desired solution to the original problem.
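
The algebra of this iteration can be checked mechanically. The sketch below is an illustration under the assumption that each equation is stored as a list holding the coefficients of Z, x1, x2, x3 followed by the right-hand side; the helper name `substitute_upper` is hypothetical. Exact fractions are used so that the coefficient 1/2 is reproduced without rounding.

```python
from fractions import Fraction

# Each row: coefficients of [Z, x1, x2, x3] followed by the right-hand side.
eq0 = [Fraction(v) for v in (1, -2, 0, 0, 20)]   # Z - 2x1 = 20
eq1 = [Fraction(v) for v in (0, 4, 1, 0, 12)]    # 4x1 + x2 = 12
eq2 = [Fraction(v) for v in (0, -2, 0, 1, 4)]    # -2x1 + x3 = 4


def substitute_upper(row, col, bound):
    """Replace the variable in column col by (bound - y) within one equation."""
    new = row[:]
    new[-1] -= row[col] * bound   # move the constant term to the right-hand side
    new[col] = -row[col]          # the column now holds the coefficient of y
    return new


# x3 reaches its upper bound u3 = 6, so column 3 becomes the y3 column.
eq0, eq1, eq2 = (substitute_upper(r, 3, 6) for r in (eq0, eq1, eq2))

# Pivot on x1 in Eq. (2), then eliminate x1 from Eqs. (0) and (1).
pivot = eq2[1]
eq2 = [v / pivot for v in eq2]
eq0 = [a - eq0[1] * b for a, b in zip(eq0, eq2)]
eq1 = [a - eq1[1] * b for a, b in zip(eq1, eq2)]

# With y3 = 0 nonbasic, read off the solution from the right-hand sides.
x1, x2, x3, Z = eq2[-1], eq1[-1], 6 - 0, eq0[-1]
```

The resulting rows correspond to Z + y3 = 22, x2 − 2y3 = 8, and x1 + (1/2)y3 = 1, and the recovered values satisfy Z = 2x1 + x2 + 2x3.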





9.2 The Dual Simplex Method 





The dual simplex method can be thought of as the mirror image of the simplex method. 
This interpretation is best explained by referring to Tables 6.10 and 6.11 and Fig. 
6.1. The simplex method deals directly with suboptimal basic solutions and moves 
toward an optimal solution by striving to satisfy the optimality test. By contrast, the 
dual simplex method deals directly with superoptimal basic solutions and moves to- 
ward an optimal solution by striving to achieve feasibility. Furthermore, the dual 
simplex method deals with a problem as if the simplex method were being applied 
simultaneously to its dual problem. If we make their initial basic solutions comple- 
mentary, the two methods move in complete sequence, obtaining complementary basic 
solutions with each iteration. 

The dual simplex method is very useful in certain special types of situations. 
Ordinarily it is easier to find an initial basic feasible solution than an initial super- 
optimal basic solution. However, it is occasionally necessary to introduce many ar- 
tificial variables to construct an initial basic feasible solution artificially. In such cases 
it may be easier to begin with a superoptimal basic solution and use the dual simplex 
method. Furthermore, fewer iterations may be required when it is not necessary to 
drive many artificial variables to zero. 

As we mentioned several times in Chap. 6 as well as in Sec. 4.7, another 
important primary application of the dual simplex method is its use in conjunction 
with sensitivity analysis. Suppose that an optimal solution has been obtained by the 
simplex method but that it becomes necessary (or of interest for sensitivity analysis) 
to make minor changes in the model. If the formerly optimal basic solution is no 
longer feasible (but still satisfies the optimality test), you can immediately apply the 
dual simplex method by starting with this superoptimal basic solution. Applying the 
dual simplex method in this way usually leads to the new optimal solution much more 
quickly than solving the new problem from the beginning with the simplex method. 

The rules for the dual simplex method are very similar to those for the simplex 
method. In fact, once they are started, the only difference between them is in the 
criteria used for selecting the entering and the leaving basic variables and for stopping 
the algorithm. 

To start the dual simplex method, we must have all the coefficients in Eq. (0) 
nonnegative (so that the basic solution is superoptimal). The basic solutions will be 
infeasible (except for the last one) only because some of the variables are negative. 
The method continues to decrease the value of the objective function, always retaining 
nonnegative coefficients in Eq. (0), until all the variables are nonnegative. Such a 
basic solution is feasible (it satisfies all the equations) and is, therefore, optimal by 
the simplex method criterion of nonnegative coefficients in Eq. (0). 

The details of the dual simplex method are summarized below. 


Summary of Dual Simplex Method 


1. Initialization step: Introduce slack variables as needed to construct a set of 
equations describing the problem. Find a basic solution such that the coef- 
ficients in Eq. (0) are zero for basic variables and nonnegative for nonbasic 
variables. Go to the feasibility test. 

2. Iterative step: 

Part 1. Determine the leaving basic variable: Select the negative basic
variable that has the largest absolute value.

Part 2. Determine the entering basic variable: Select the nonbasic variable
whose coefficient in Eq. (0) reaches zero first as an increasing multiple
of the equation containing the leaving basic variable is added to Eq. (0). 
This selection is made by checking the nonbasic variables with negative 
coefficients in that equation (the one containing the leaving basic variable) 
and selecting the one with the smallest ratio of the Eq. (0) coefficient to the 
absolute value of the coefficient in that equation. 

Part 3. Determine the new basic solution: Starting from the current set 
of equations, solve for the basic variables in terms of the nonbasic variables 
by Gaussian elimination (see Appendix 4). When we set the nonbasic vari- 
ables equal to zero, each basic variable (and Z) equals the new right-hand 
side of the one equation in which it appears (with a coefficient of +1). 

3. Feasibility test: Determine whether this solution is feasible (and therefore 
optimal): Check to see whether all the basic variables are nonnegative. If 
they are, then this solution is feasible, and therefore optimal, so stop. Other- 
wise, go to the iterative step. 


To fully understand the dual simplex method, you must realize that the method 
proceeds just as if the simplex method were being applied to the complementary basic 
solutions in the dual problem. (In fact, this interpretation was the motivation for 
constructing the method as it is.) Part 1, determining the leaving basic variable, is 
equivalent to determining the entering basic variable in the dual problem. The variable 
with the largest negative value corresponds to the largest negative coefficient in Eq. 
(0) of the dual problem (see Table 6.3). Part 2, determining the entering basic variable, 
is equivalent to determining the leaving basic variable in the dual problem. The 
coefficient in Eq. (0) that reaches zero first corresponds to the variable in the dual 
problem that reaches zero first. The two criteria for stopping the algorithm are also 
complementary. 

We shall now illustrate the dual simplex method by applying it to the dual
problem for the Wyndor Glass Co. (see Table 6.1). Normally this method is applied
directly to the problem of concern (a primal problem). However, we have chosen this
problem because you have already seen the simplex method applied to its dual problem
(namely, the primal problem¹) in Table 4.8, so you can compare the two. To facilitate
the comparison, we shall continue to denote the decision variables in the problem
being solved by y_i rather than x_j.

In maximization form, the problem to be solved is 


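\[
\begin{aligned}
\text{Maximize}\quad & Z = -4y_1 - 12y_2 - 18y_3, \\
\text{subject to}\quad & y_1 + 3y_3 \ge 3 \\
& 2y_2 + 2y_3 \ge 5 \\
\text{and}\quad & y_1 \ge 0,\quad y_2 \ge 0,\quad y_3 \ge 0.
\end{aligned}
\]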


After the functional constraints are converted to ≤ form and the slack variables are
introduced, the initial set of equations is that shown for iteration 0 in Table 9.2. Notice
that all the coefficients in Eq. (0) are nonnegative, so the solution is optimal if it is
feasible.


¹ Recall that the symmetry property in Secs. 6.1 and 6.4 points out that the dual of a dual problem is the
original primal problem.


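Table 9.2 Dual Simplex Method Applied to Wyndor Glass Co. Dual Problem

            Basic     Eq.               Coefficient of              Right
Iteration   Variable  No.    Z    y_1    y_2    y_3    y_4    y_5   Side
--------------------------------------------------------------------------
    0       Z          0     1     4     12     18      0      0      0
            y_4        1     0    -1      0     -3      1      0     -3
            y_5        2     0     0     -2     -2      0      1     -5
--------------------------------------------------------------------------
    1       Z          0     1     4      0      6      0      6    -30
            y_4        1     0    -1      0     -3      1      0     -3
            y_2        2     0     0      1      1      0   -1/2    5/2
--------------------------------------------------------------------------
    2       Z          0     1     2      0      0      2      6    -36
            y_3        1     0    1/3     0      1   -1/3      0      1
            y_2        2     0   -1/3     1      0    1/3   -1/2    3/2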











The initial basic solution is y_1 = 0, y_2 = 0, y_3 = 0, y_4 = -3, y_5 = -5,
with Z = 0, which is not feasible because of the negative values. The leaving basic
variable is y_5 (5 > 3), and the entering basic variable is y_2 (12/2 < 18/2), which leads
to the second set of equations, labeled as iteration 1 in Table 9.2. The corresponding
basic solution is y_1 = 0, y_2 = 5/2, y_3 = 0, y_4 = -3, y_5 = 0, with Z = -30, which
is not feasible.

The next leaving basic variable is y_4, and the entering basic variable is y_3
(6/3 < 4/1), which leads to the final set of equations in Table 9.2. The corresponding
basic solution is y_1 = 0, y_2 = 3/2, y_3 = 1, y_4 = 0, y_5 = 0, with Z = -36, which
is feasible and therefore optimal.
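
The three parts of the iterative step and the feasibility test can be sketched in Python
as follows. This is a minimal illustration under stated assumptions, not a production
solver: the function name dual_simplex, the row layout (variable coefficients followed by
the right side, with the Z column omitted because its coefficient is always 1), and the
zero-based column indices are all choices made here for the sketch. Exact fractions are
used so the results can be compared with the fractions in the tableau.

```python
from fractions import Fraction


def dual_simplex(rows, basis):
    """Run the dual simplex method on a tableau.

    rows[0] is Eq. (0); every row holds the variable coefficients followed by
    the right side. basis[i] is the column of the basic variable of rows[i + 1].
    Assumes all Eq. (0) coefficients start out nonnegative.
    """
    rows = [[Fraction(v) for v in r] for r in rows]
    basis = list(basis)
    while True:
        # Feasibility test: stop once every basic variable is nonnegative.
        rhs = [r[-1] for r in rows[1:]]
        if min(rhs) >= 0:
            return rows, basis
        # Part 1: the leaving variable is the most negative basic variable.
        r = 1 + rhs.index(min(rhs))
        # Part 2: minimum ratio over the negative coefficients in that row.
        cands = [j for j in range(len(rows[r]) - 1) if rows[r][j] < 0]
        if not cands:
            raise ValueError("the problem has no feasible solutions")
        c = min(cands, key=lambda j: rows[0][j] / -rows[r][j])
        # Part 3: Gaussian elimination on the pivot element.
        pivot = rows[r][c]
        rows[r] = [v / pivot for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                m = rows[i][c]
                rows[i] = [a - m * b for a, b in zip(rows[i], rows[r])]
        basis[r - 1] = c


# Iteration 0 of the example: columns y_1..y_5, then the right side.
tableau = [
    [4, 12, 18, 0, 0, 0],     # Eq. (0)
    [-1, 0, -3, 1, 0, -3],    # Eq. (1), basic variable y_4
    [0, -2, -2, 0, 1, -5],    # Eq. (2), basic variable y_5
]
final_rows, final_basis = dual_simplex(tableau, basis=[3, 4])

# Read off the solution: nonbasic variables are zero.
y = [Fraction(0)] * 5
for i, j in enumerate(final_basis):
    y[j] = final_rows[i + 1][-1]
```

With this input the sketch performs the same two pivots as the text (y_5 out and y_2 in,
then y_4 out and y_3 in) and stops with y_2 = 3/2, y_3 = 1, and Z = -36.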

Notice that the optimal solution for the dual of this problem¹ is x_1* = 2,
x_2* = 6, x_3* = 2, x_4* = 0, x_5* = 0, as was obtained in Table 4.8 by the simplex
method. We suggest that you now trace through Tables 9.2 and 4.8 simultaneously
and compare the complementary steps for the two mirror-image methods.

¹ The complementary optimal basic solutions property presented in Sec. 6.3 indicates how to read the
optimal solution for the dual problem from row 0 of the final simplex tableau for the primal problem. This
same conclusion holds regardless of whether the simplex method or the dual simplex method is used to
obtain the final tableau.


9.3 Parametric Linear Programming


At the end of Sec. 6.7 we described parametric linear programming and its use for
conducting sensitivity analysis systematically by gradually changing various model
parameters simultaneously. We shall now present the algorithmic procedure, first for
the case where the c_j parameters are being changed and then where the b_i parameters
are varied.


Systematic Changes in the c_j Parameters


For the case where the c_j parameters are being changed, the objective function of the
ordinary linear programming model,

\[
Z = \sum_{j=1}^{n} c_j x_j,
\]

is replaced by

\[
Z(\theta) = \sum_{j=1}^{n} (c_j + \alpha_j \theta) x_j,
\]


where the α_j are given input constants representing the relative rates at which the
coefficients are to be changed. Therefore, gradually increasing θ from zero changes
the coefficients at these relative rates.

The values assigned to the α_j may represent interesting simultaneous changes
of the c_j for systematic sensitivity analysis of the effect of increasing the magnitude
of these changes. They may also be based on how the coefficients (e.g., unit profits)
would change together with respect to some factor measured by θ. This factor might
be uncontrollable, e.g., the state of the economy. However, it may also be under the
control of the decision maker, e.g., the amount of personnel and equipment to shift
from some of the activities to others.

For any given value of θ, the optimal solution of the corresponding linear
programming problem can be obtained by the simplex method. This solution may
have been obtained already for the original problem where θ = 0. However, the
objective is to find the optimal solution of the modified linear programming problem
[maximize Z(θ) subject to the original constraints] as a function of θ. Therefore, the
solution procedure needs to be able to determine when and how the optimal solution
changes (if it does) as θ increases from zero to any specified positive number.

Figure 9.1 illustrates how Z*(θ), the objective function value for the optimal
solution (given θ), changes as θ increases. Z*(θ) always has this piecewise linear and
convex¹ form (see Prob. 24). The corresponding optimal solution changes (as θ
increases) just at the values of θ where the slope of the Z*(θ) function changes. Thus
Fig. 9.1 depicts a problem where three different solutions are optimal for different
values of θ, one for 0 ≤ θ ≤ θ_1, the second for θ_1 ≤ θ ≤ θ_2, and the third for
θ ≥ θ_2. Because the value of each x_j remains the same within each of these intervals
for θ, the value of Z*(θ) varies with θ only because the coefficients of the x_j are
changing as a linear function of θ.

The solution procedure is based directly upon the sensitivity analysis procedure
for investigating changes in the c_j parameters (cases 2a and 3, Sec. 6.7). As described
in the last subsection of Sec. 6.7, the only basic difference with parametric linear
programming is that the changes now are expressed in terms of θ rather than as
specific numbers.

Figure 9.1 Objective function value for an optimal solution as a function of θ for parametric linear
programming with systematic changes in the c_j parameters.

¹ See Appendix 1 for a definition and discussion of convex functions.

To illustrate, suppose that α_1 = 2 and α_2 = -1 for the original Wyndor Glass
Co. problem (see Sec. 3.1 and Table 4.8), so that

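\[
Z(\theta) = (3 + 2\theta)x_1 + (5 - \theta)x_2.
\]

Beginning with the final simplex tableau for θ = 0 (Table 4.8), its Eq. (0),

\[
(0)\quad Z + \tfrac{3}{2}x_4 + x_5 = 36,
\]

would first have these changes from the original (θ = 0) coefficients added into it
on the left-hand side:

\[
(0)\quad Z - 2\theta x_1 + \theta x_2 + \tfrac{3}{2}x_4 + x_5 = 36.
\]

Because both x_1 and x_2 are basic variables [appearing in Eqs. (3) and (2), respectively],
they both need to be eliminated algebraically from Eq. (0):

\[
\begin{aligned}
& Z - 2\theta x_1 + \theta x_2 + \tfrac{3}{2}x_4 + x_5 = 36 \\
& +\,2\theta \text{ times Eq. (3)} \\
& -\,\theta \text{ times Eq. (2)} \\
(0)\quad & Z + \left(\tfrac{3}{2} - \tfrac{7}{6}\theta\right)x_4 + \left(1 + \tfrac{2}{3}\theta\right)x_5 = 36 - 2\theta.
\end{aligned}
\]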





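The optimality test says that the current basic feasible solution will remain
optimal as long as these coefficients of the nonbasic variables remain nonnegative:

\[
\begin{aligned}
\tfrac{3}{2} - \tfrac{7}{6}\theta \ge 0, &\quad \text{for } 0 \le \theta \le \tfrac{9}{7}, \\
1 + \tfrac{2}{3}\theta \ge 0, &\quad \text{for all } \theta \ge 0.
\end{aligned}
\]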


Therefore, after increasing θ past θ = 9/7, x_4 would need to be the entering basic
variable for another iteration of the simplex method to find the new optimal solution.
Then θ would be increased further until another coefficient goes negative, and so on
until θ has been increased as far as desired.
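
The calculation of the critical value of θ can be written as a small helper. The sketch
below is illustrative only: the name theta_limit and the (constant, slope) encoding of
each Eq. (0) coefficient are assumptions made here, not notation from the text.

```python
from fractions import Fraction


def theta_limit(linear_terms):
    """Find where the first coefficient of the form a + b*theta reaches zero.

    linear_terms maps a variable name to (a, b), with a >= 0 assumed.
    Returns (limit, name), or (None, None) if no coefficient ever goes negative.
    """
    best, who = None, None
    for name, (a, b) in linear_terms.items():
        if b < 0:                      # only decreasing coefficients can hit zero
            t = -a / b
            if best is None or t < best:
                best, who = t, name
    return best, who


# Coefficients of the nonbasic variables in Eq. (0) of the example.
eq0 = {
    "x_4": (Fraction(3, 2), Fraction(-7, 6)),   # 3/2 - (7/6) theta
    "x_5": (Fraction(1), Fraction(2, 3)),       # 1 + (2/3) theta
}
limit, entering = theta_limit(eq0)
```

For the example this returns the limit 9/7 together with x_4, the variable that must
enter the basis once θ passes that value.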


This entire procedure is now summarized, and the example is completed in 
Table 9.3. 


Summary of Parametric Programming Procedure for
Systematic Changes in the c_j Parameters


Step 1: Solve the problem with θ = 0 by the simplex method.


Step 2: Use the sensitivity analysis procedure (cases 2a and 3, Sec. 6.7) to introduce
the Δc_j = α_j θ changes into Eq. (0).


Step 3: Increase θ until one of the nonbasic variables has its coefficient in Eq. (0) go
negative (or until θ has been increased as far as desired).


Step 4: Use this variable as the entering basic variable for an iteration of the simplex
method to find the new optimal solution. Return to step 3.
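
As an independent cross-check on this procedure (not a substitute for it), Z*(θ) for the
Wyndor example can be evaluated by brute force over the corner-point feasible solutions.
The sketch below assumes the standard statement of the Wyndor Glass Co. problem from
Sec. 3.1 (x_1 ≤ 4, 2x_2 ≤ 12, 3x_1 + 2x_2 ≤ 18), which is not restated in this section;
the names CORNERS and z_star are hypothetical.

```python
from fractions import Fraction

# Corner-point feasible solutions (x_1, x_2) of the Wyndor Glass Co. problem.
# Assumption: taken from the problem statement in Sec. 3.1.
CORNERS = [(0, 0), (4, 0), (4, 3), (2, 6), (0, 6)]


def z_star(theta):
    """Return (Z*(theta), optimal corner) for max (3 + 2θ)x_1 + (5 - θ)x_2."""
    return max(
        ((3 + 2 * theta) * x1 + (5 - theta) * x2, (x1, x2))
        for x1, x2 in CORNERS
    )


low = z_star(0)                        # first interval: (2, 6) is optimal
mid = z_star(2)                        # second interval: (4, 3) is optimal
high = z_star(6)                       # third interval: (4, 0) is optimal
at_break = z_star(Fraction(9, 7))[0]   # value at the first breakpoint
```

The three sample values agree with 36 - 2θ, 27 + 5θ, and 12 + 8θ on the respective
intervals, and the slopes -2, 5, 8 increase from one interval to the next, as the convex
shape of Z*(θ) requires.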


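Table 9.3 The c_j Parametric Programming Procedure Applied to Wyndor Glass Co. Example

Range         Basic     Eq.                      Coefficient of                        Right     Optimal
of θ          Variable  No.   Z   x_1   x_2        x_3          x_4          x_5       Side      Solution
---------------------------------------------------------------------------------------------------------------
0 ≤ θ ≤ 9/7   Z(θ)      0     1    0     0          0        (9 - 7θ)/6   (3 + 2θ)/3   36 - 2θ   x_4 = 0, x_5 = 0
              x_3       1     0    0     0          1           1/3         -1/3         2       x_3 = 2
              x_2       2     0    0     1          0           1/2           0          6       x_2 = 6
              x_1       3     0    1     0          0          -1/3          1/3         2       x_1 = 2
---------------------------------------------------------------------------------------------------------------
9/7 ≤ θ ≤ 5   Z(θ)      0     1    0     0     (-9 + 7θ)/2       0        (5 - θ)/2    27 + 5θ   x_3 = 0, x_5 = 0
              x_4       1     0    0     0          3            1           -1          6       x_4 = 6
              x_2       2     0    0     1        -3/2           0           1/2         3       x_2 = 3
              x_1       3     0    1     0          1            0            0          4       x_1 = 4
---------------------------------------------------------------------------------------------------------------
θ ≥ 5         Z(θ)      0     1    0   -5 + θ     3 + 2θ         0            0        12 + 8θ   x_2 = 0, x_3 = 0
              x_4       1     0    0     2          0            1            0         12       x_4 = 12
              x_5       2     0    0     2         -3            0            1          6       x_5 = 6
              x_1       3     0    1     0          1            0            0          4       x_1 = 4


Systematic Changes in the b_i Parameters


For the case where the b_i parameters change systematically, the one modification made
in the original linear programming model is that b_i is replaced by (b_i + α_i θ), for
i = 1, 2, ..., m, where the α_i are given input constants. Thus the problem becomes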


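\[
\begin{aligned}
\text{Maximize}\quad & Z(\theta) = \sum_{j=1}^{n} c_j x_j, \\
\text{subject to}\quad & \sum_{j=1}^{n} a_{ij} x_j \le b_i + \alpha_i \theta, \quad \text{for } i = 1, 2, \ldots, m, \\
\text{and}\quad & x_j \ge 0, \quad \text{for } j = 1, 2, \ldots, n.
\end{aligned}
\]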


The goal is to identify the optimal solution as a function of θ.

With this formulation, the corresponding objective function value Z*(θ) always
has the piecewise linear and concave¹ form shown in Fig. 9.2. (See Prob. 25.) The
set of basic variables in the optimal solution still changes (as θ increases) only where
the slope of Z*(θ) changes. However, in contrast to the preceding case, the values of
these variables now change as a (linear) function of θ between the slope changes. The
reason is that increasing θ changes the right-hand sides in the initial set of equations,
which then causes changes in the right-hand sides in the final set of equations, i.e.,
in the values of the final set of basic variables. Figure 9.2 depicts a problem with
three sets of basic variables that are optimal for different values of θ, one for
0 ≤ θ ≤ θ_1, the second for θ_1 ≤ θ ≤ θ_2, and the third for θ ≥ θ_2. Within each of these
intervals for θ, the value of Z*(θ) varies with θ despite the fixed coefficients c_j because
the values of the x_j are changing.

¹ See Appendix 1 for a definition and discussion of concave functions.

Figure 9.2 Objective function value for an optimal solution as a function of θ for parametric linear
programming with systematic changes in the b_i parameters.

The following solution procedure summary is very similar to that just presented
for systematic changes in the c_j parameters. The reason is that changing the b_i is
equivalent to changing the coefficients in the objective function of the dual model.
Therefore, the procedure for the primal problem is exactly complementary to applying
simultaneously the procedure for systematic changes in the c_j parameters to the dual
problem. Consequently, the dual simplex method (see Sec. 9.2) now would be used
to obtain each new optimal solution, and the applicable sensitivity analysis case (see
Sec. 6.7) now is case 1, but these differences are the only major differences.


Summary of Parametric Programming Procedure for
Systematic Changes in the b_i Parameters


Step 1: Solve the problem with θ = 0 by the simplex method.


Step 2: Use the sensitivity analysis procedure (case 1, Sec. 6.7) to introduce the
Δb_i = α_i θ changes into the right-side column.


Step 3: Increase θ until one of the basic variables has its value in the right-side column
go negative (or until θ has been increased as far as desired).


Step 4: Use this variable as the leaving basic variable for an iteration of the dual
simplex method to find the new optimal solution. Return to step 3.


To illustrate this procedure in a way that demonstrates its duality relationship
with the procedure for systematic changes in the c_j parameters, we shall now apply
it to the dual problem for the Wyndor Glass Co. (see Table 6.1). In particular, suppose
that α_1 = 2 and α_2 = -1 so that the functional constraints become

\[
\begin{aligned}
y_1 + 3y_3 \ge 3 + 2\theta, &\quad \text{or} \quad -y_1 - 3y_3 \le -3 - 2\theta \\
2y_2 + 2y_3 \ge 5 - \theta, &\quad \text{or} \quad -2y_2 - 2y_3 \le -5 + \theta.
\end{aligned}
\]


Thus the dual of this problem is just the example considered in Table 9.3. 
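This problem with θ = 0 has already been solved in Table 9.2, so we begin
with the final simplex tableau given there. Using the sensitivity analysis procedure
for case 1, Sec. 6.7, we find that the entries in the right-side column of this tableau
change to the values given below.

\[
Z^* = \mathbf{y}^* \bar{\mathbf{b}} = [2,\ 6]
\begin{bmatrix} -3 - 2\theta \\ -5 + \theta \end{bmatrix} = -36 + 2\theta,
\]

\[
\mathbf{b}^* = \mathbf{S}^* \bar{\mathbf{b}} =
\begin{bmatrix} -\tfrac{1}{3} & 0 \\ \tfrac{1}{3} & -\tfrac{1}{2} \end{bmatrix}
\begin{bmatrix} -3 - 2\theta \\ -5 + \theta \end{bmatrix} =
\begin{bmatrix} 1 + \tfrac{2\theta}{3} \\ \tfrac{3}{2} - \tfrac{7\theta}{6} \end{bmatrix}.
\]

Therefore, the two basic variables in this tableau,

\[
y_3 = \frac{3 + 2\theta}{3} \quad \text{and} \quad y_2 = \frac{9 - 7\theta}{6},
\]

remain nonnegative for 0 ≤ θ ≤ 9/7. Increasing θ past θ = 9/7 requires making y_2 a
leaving basic variable for another iteration of the dual simplex method, and so on,
as summarized in Table 9.4.

Table 9.4 The b_i Parametric Programming Procedure Applied to Dual of Wyndor
Glass Co. Example

Range         Basic     Eq.             Coefficient of               Right          Optimal
of θ          Variable  No.   Z   y_1   y_2   y_3   y_4   y_5        Side           Solution
----------------------------------------------------------------------------------------------------
0 ≤ θ ≤ 9/7   Z(θ)      0     1    2     0     0     2     6       -36 + 2θ        y_1 = y_4 = y_5 = 0
              y_3       1     0   1/3    0     1   -1/3    0      (3 + 2θ)/3       y_3 = (3 + 2θ)/3
              y_2       2     0  -1/3    1     0    1/3  -1/2     (9 - 7θ)/6       y_2 = (9 - 7θ)/6
----------------------------------------------------------------------------------------------------
9/7 ≤ θ ≤ 5   Z(θ)      0     1    0     6     0     4     3       -27 - 5θ        y_2 = y_4 = y_5 = 0
              y_3       1     0    0     1     1     0   -1/2      (5 - θ)/2       y_3 = (5 - θ)/2
              y_1       2     0    1    -3     0    -1    3/2     (-9 + 7θ)/2      y_1 = (-9 + 7θ)/2
----------------------------------------------------------------------------------------------------
θ ≥ 5         Z(θ)      0     1    0    12     6     4     0       -12 - 8θ        y_2 = y_3 = y_4 = 0
              y_5       1     0    0    -2    -2     0     1        -5 + θ         y_5 = -5 + θ
              y_1       2     0    1     0     3    -1     0        3 + 2θ         y_1 = 3 + 2θ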

We suggest that you now trace through Tables 9.3 and 9.4 simultaneously to 
note the duality relationship between the two procedures. 
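
The right-side update used in this example can be reproduced with a few lines of exact
arithmetic. In the sketch below, each quantity that is linear in θ is stored as a
(constant, slope) pair; this encoding and the helper name dot_linear are assumptions made
for illustration. The entries of y* and S* are those of the y_4 and y_5 columns in the
final tableau of the dual simplex example.

```python
from fractions import Fraction as F

# y* and S* from the slack-variable columns (y_4, y_5) of the final tableau.
y_star = [F(2), F(6)]
S_star = [[F(-1, 3), F(0)],
          [F(1, 3), F(-1, 2)]]

# Right sides b_i + alpha_i * theta, stored as (constant, slope) pairs:
# -3 - 2*theta and -5 + theta.
b_bar = [(F(-3), F(-2)), (F(-5), F(1))]


def dot_linear(row, b):
    """Return (constant, slope) of the product row . b."""
    const = sum(r * c for r, (c, _) in zip(row, b))
    slope = sum(r * s for r, (_, s) in zip(row, b))
    return const, slope


z_rhs = dot_linear(y_star, b_bar)                    # right side of Eq. (0)
b_new = [dot_linear(row, b_bar) for row in S_star]   # values of y_3 and y_2

# Largest theta for which every basic variable stays nonnegative.
theta_max = min(-c / s for c, s in b_new if s < 0)
```

The result matches the values above: Z* = -36 + 2θ, y_3 = 1 + 2θ/3, y_2 = 3/2 - 7θ/6,
and the current basis stays feasible up to θ = 9/7.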


9.4 An Interior-Point Algorithm 


In Sec. 4.9 we discussed a dramatic new development in linear programming, the
invention by Narendra Karmarkar of AT&T Bell Laboratories of a powerful new
algorithm for solving huge linear programming problems. We now introduce the nature
of Karmarkar’s approach by describing a relatively elementary variant (the "affine"
or "affine-scaling" variant) of his algorithm.¹

Throughout this section we shall focus on Karmarkar’s main ideas on an intuitive 
level while avoiding mathematical details. In particular, we shall bypass certain details 
that are needed for the full implementation of the algorithm (e.g., how to find an 
initial feasible trial solution) but are not central to a basic conceptual understanding. 
The ideas to be described can be summarized as follows: 


Concept 1: Shoot through the interior of the feasible region toward an optimal 
solution. 

Concept 2: Move in a direction that improves the objective function value at 
the fastest possible rate. 

Concept 3: Transform the feasible region to place the current trial solution 
near its center, thereby enabling a large improvement when implementing 
Concept 2. 


To illustrate these ideas throughout the section, we shall use the following 
example: 


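\[
\begin{aligned}
\text{Maximize}\quad & Z = x_1 + 2x_2, \\
\text{subject to}\quad & x_1 + x_2 \le 8 \\
\text{and}\quad & x_1 \ge 0,\quad x_2 \ge 0.
\end{aligned}
\]

This problem is depicted graphically in Fig. 9.3, where the optimal solution is seen
to be (x_1, x_2) = (0, 8) with Z = 16.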


The Relevance of the Gradient for Concepts 1 and 2 


The algorithm begins with an initial trial solution that (like all subsequent trial
solutions) lies in the interior of the feasible region. Thus, for the example, the solution
must not lie on any of the three lines (x_1 = 0, x_2 = 0, x_1 + x_2 = 8) that form the
boundary of this region in Fig. 9.3. We have arbitrarily chosen (x_1, x_2) = (2, 2) to
be this initial trial solution.

To begin implementing Concepts 1 and 2, note in Fig. 9.3 that the direction of
movement from (2, 2) that increases Z at the fastest possible rate is perpendicular to
(and toward) the objective function line, Z = 16 = x_1 + 2x_2. We have shown this
direction by the arrow from (2, 2) to (3, 4). Using vector addition,


(3, 4) = (2, 2) + (1, 2),


where the vector (1, 2) is the gradient of the objective function. (We will discuss
gradients further in Sec. 14.5 in the broader context of nonlinear programming, where
algorithms similar to Karmarkar’s have long been used.) The components of (1, 2)
are just the coefficients in the objective function. Thus, with one subsequent
modification, the gradient (1, 2) defines the ideal direction in which to move, where the
question of the distance to move will be considered later.

¹ The basic approach for this variant actually was proposed in 1967 by a Russian mathematician, I. I.
Dikin, and then rediscovered soon after the appearance of Karmarkar’s work by a number of researchers,
including E. R. Barnes, T. M. Cavalier, and A. L. Soyster. Also see Vanderbei, Robert J., Marc S.
Meketon, and Barry A. Freedman: "A Modification of Karmarkar’s Linear Programming Algorithm,"
Algorithmica, 1(4) (Special Issue on New Approaches to Linear Programming): 395-407, 1986.

Figure 9.3 Example for the interior-point algorithm.

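The algorithm actually operates on linear programming problems after they have
been rewritten in augmented form. Letting x_3 be the slack variable for the functional
constraint of the example, this form is

\[
\begin{aligned}
\text{Maximize}\quad & Z = x_1 + 2x_2, \\
\text{subject to}\quad & x_1 + x_2 + x_3 = 8 \\
\text{and}\quad & x_1 \ge 0,\quad x_2 \ge 0,\quad x_3 \ge 0.
\end{aligned}
\]

Using matrix notation (slightly different from Chap. 5), the augmented form can be
written in general as

\[
\begin{aligned}
\text{Maximize}\quad & Z = \mathbf{c}^T \mathbf{x}, \\
\text{subject to}\quad & \mathbf{A}\mathbf{x} = \mathbf{b} \\
\text{and}\quad & \mathbf{x} \ge \mathbf{0},
\end{aligned}
\]

where

\[
\mathbf{c} = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix},\quad
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix},\quad
\mathbf{A} = [1,\ 1,\ 1],\quad
\mathbf{b} = [8],\quad
\mathbf{0} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

for the example. Note that c^T = [1, 2, 0] now is the gradient of the objective function.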

The augmented form of the example is depicted graphically in Fig. 9.4.
The feasible region now consists of the triangle with vertices (8, 0, 0), (0, 8, 0),
(0, 0, 8). Points in the interior of this feasible region are those where x_1 > 0,
x_2 > 0, and x_3 > 0. Each of these three x_j > 0 conditions has the effect of forcing
(x_1, x_2) away from one of the three lines forming the boundary of the feasible region
in Fig. 9.3.


Figure 9.4 Example in augmented form for the interior-point algorithm.


Using the Projected Gradient to Implement Concepts 1 and 2 


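In augmented form, the initial trial solution for the example is (x_1, x_2, x_3) =
(2, 2, 4). Adding the gradient (1, 2, 0) leads to


(3, 4, 4) = (2, 2, 4) + (1, 2, 0).


However, now there is a complication. The algorithm cannot move from (2, 2, 4)
toward (3, 4, 4), because (3, 4, 4) is infeasible! When x_1 = 3 and x_2 = 4, then
x_3 = 8 - x_1 - x_2 = 1 instead of 4. The point (3, 4, 4) lies on the near side as you
look down on the feasible triangle in Fig. 9.4. Therefore, to remain feasible, the
algorithm (indirectly) projects the point (3, 4, 4) down onto the feasible triangle by
dropping a line that is perpendicular to this triangle. This perpendicular line intersects
the triangle at (2, 3, 3). Because


(2, 3, 3) = (2, 2, 4) + (0, 1, -1),


the projected gradient of the objective function (the gradient projected onto the
feasible region) is (0, 1, -1). It is this projected gradient that defines the direction
of movement for the algorithm, as shown by the arrow in Fig. 9.4.

A formula is available for computing the projected gradient directly. Defining
the projection matrix P as

\[
\mathbf{P} = \mathbf{I} - \mathbf{A}^T(\mathbf{A}\mathbf{A}^T)^{-1}\mathbf{A},
\]

the projected gradient (in column form) is

\[
\mathbf{c}_p = \mathbf{P}\mathbf{c}.
\]

Thus, for the example,

\[
\begin{aligned}
\mathbf{P} &= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
 - \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
 \left( [1\ \ 1\ \ 1] \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right)^{-1} [1\ \ 1\ \ 1] \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
 - \frac{1}{3} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} [1\ \ 1\ \ 1] \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
 - \frac{1}{3} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
 = \begin{bmatrix} \tfrac{2}{3} & -\tfrac{1}{3} & -\tfrac{1}{3} \\ -\tfrac{1}{3} & \tfrac{2}{3} & -\tfrac{1}{3} \\ -\tfrac{1}{3} & -\tfrac{1}{3} & \tfrac{2}{3} \end{bmatrix},
\end{aligned}
\]

so

\[
\mathbf{c}_p = \begin{bmatrix} \tfrac{2}{3} & -\tfrac{1}{3} & -\tfrac{1}{3} \\ -\tfrac{1}{3} & \tfrac{2}{3} & -\tfrac{1}{3} \\ -\tfrac{1}{3} & -\tfrac{1}{3} & \tfrac{2}{3} \end{bmatrix}
\begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}
= \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}.
\]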
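
For a problem with a single functional constraint, as in this example, the product
A A^T is a scalar, so the projection can be computed without forming P explicitly. The
following Python sketch relies on that assumption (one constraint row only); the
function name projected_gradient is a hypothetical choice.

```python
from fractions import Fraction


def projected_gradient(a, c):
    """Project c onto the null space of the single constraint row a.

    With one row, A A^T is the scalar a.a, so P c = c - a * (a.c) / (a.a).
    """
    aa = sum(ai * ai for ai in a)
    ac = sum(ai * ci for ai, ci in zip(a, c))
    return [Fraction(ci) - Fraction(ai * ac, aa) for ai, ci in zip(a, c)]


a = [1, 1, 1]          # the row of A for x_1 + x_2 + x_3 = 8
c = [1, 2, 0]          # gradient of the objective function
c_p = projected_gradient(a, c)

# Moving along c_p leaves x_1 + x_2 + x_3 unchanged, so a . c_p must be zero.
residual = sum(ai * di for ai, di in zip(a, c_p))
```

The computed direction is (0, 1, -1), in agreement with the matrix calculation, and the
zero residual confirms that a step along it stays on the plane Ax = b.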


Moving from (2, 2, 4) in the direction of the projected gradient (0, 1, -1)
involves increasing α from zero in the formula,

\[
\mathbf{x} = \begin{bmatrix} 2 \\ 2 \\ 4 \end{bmatrix} + 4\alpha\,\mathbf{c}_p
= \begin{bmatrix} 2 \\ 2 \\ 4 \end{bmatrix} + 4\alpha \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix},
\]

where the coefficient 4 is used simply to give an upper bound of 1 for α to maintain
feasibility (all x_j ≥ 0). Thus α measures the fraction used of the distance that could
be moved before leaving the feasible region.

How large should α be made for moving to the next trial solution? Because the
increase in Z is proportional to α, a value close to the upper bound of 1 is good for
giving a relatively large step toward optimality on the current iteration. However, the
problem with a value too close to 1 is that the next trial solution then is jammed
against a constraint boundary, thereby making it difficult to take large improving steps
during subsequent iterations. It is very helpful for trial solutions to be near the center
of the feasible region (or at least the portion of the feasible region in the vicinity of
an optimal solution), and not too close to any constraint boundary. With this in mind,
Karmarkar has stated for his algorithm that a value as large as α = 0.25 should be
"safe." In practice, much larger values (for example, α = 0.9) sometimes are used.
For purposes of this example (and the problems at the end of the chapter), we have
chosen α = 0.5.
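
The role of the coefficient 4 and of α in the formula above can be illustrated with a
short Python sketch. It covers only the uncentered step described in this subsection
(the algorithm itself rescales the variables first, as the next subsection explains),
and the helper name max_step is an assumption made here.

```python
from fractions import Fraction


def max_step(x, d):
    """Largest t with x + t*d >= 0 in every component.

    Returns None if d has no negative component (no boundary is ever reached).
    """
    ratios = [Fraction(xi) / -di for xi, di in zip(x, d) if di < 0]
    return min(ratios) if ratios else None


x = [2, 2, 4]                 # current trial solution
c_p = [0, 1, -1]              # projected gradient
alpha = Fraction(1, 2)        # fraction of the distance to the boundary

t_max = max_step(x, c_p)      # x_3 reaches zero first
x_next = [xi + alpha * t_max * di for xi, di in zip(x, c_p)]
```

Here t_max equals 4, which is the coefficient in the formula, so α = 1 would land exactly
on the boundary x_3 = 0, while α = 0.5 moves halfway there, to (2, 4, 2).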


A Centering Scheme for Implementing Concept 3 


We now have just one more step to complete the description of the algorithm, namely,
a special scheme for transforming the feasible region to place the current trial solution
near its center. We have just described the benefit of having the trial solution near
the center, but another important benefit of this centering scheme is that it keeps
turning the direction of the projected gradient to point more nearly toward an optimal
solution as the algorithm converges toward this solution.

The basic idea of the centering scheme is straightforward—simply change the 
scale (units) for each of the variables so that the trial solution becomes equidistant 
from the constraint boundaries in the new coordinate system. (Karmarkar’s original 
algorithm uses a more sophisticated centering scheme.) 

For the example, there are three constraint boundaries in Fig. 9.3, each one
corresponding to a zero value for one of the three variables of the problem in
augmented form, namely, x_1 = 0, x_2 = 0, and x_3 = 0. In Fig. 9.4, see how these three
constraint boundaries intersect the Ax = b (x_1 + x_2 + x_3 = 8) plane to form the
boundary of the feasible region. The initial trial solution is (x_1, x_2, x_3) = (2, 2, 4),
so this solution is two units away from the x_1 = 0 and x_2 = 0 constraint boundaries
and four units away from the x_3 = 0 constraint boundary, when using the units of
the respective variables. However, whatever these units are in each case, they are
quite arbitrary and can be changed as desired without changing the problem.
Therefore, let us rescale the variables as follows:

\[
\tilde{x}_1 = \frac{x_1}{2}, \qquad \tilde{x}_2 = \frac{x_2}{2}, \qquad \tilde{x}_3 = \frac{x_3}{4}
\]

in order to make the current trial solution of (x_1, x_2, x_3) = (2, 2, 4) become

\[
(\tilde{x}_1, \tilde{x}_2, \tilde{x}_3) = (1, 1, 1).
\]

In these new coordinates, the problem becomes

\[
\begin{aligned}
\text{Maximize}\quad & Z = 2\tilde{x}_1 + 4\tilde{x}_2, \\
\text{subject to}\quad & 2\tilde{x}_1 + 2\tilde{x}_2 + 4\tilde{x}_3 = 8 \\
\text{and}\quad & \tilde{x}_1 \ge 0,\quad \tilde{x}_2 \ge 0,\quad \tilde{x}_3 \ge 0,
\end{aligned}
\]

as depicted graphically in Fig. 9.5.
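
The claim that rescaling changes the units but not the problem can be checked
numerically. The sketch below is a small illustration under assumed names (d, to_scaled,
dot); the test point (3, 1, 4) is an arbitrary feasible point chosen here, not one taken
from the text.

```python
from fractions import Fraction

d = [2, 2, 4]                      # scale factors = current trial solution
A, c = [1, 1, 1], [1, 2, 0]        # constraint row and objective coefficients

A_t = [a * dj for a, dj in zip(A, d)]      # rescaled constraint row
c_t = [cj * dj for cj, dj in zip(c, d)]    # rescaled objective coefficients


def to_scaled(x):
    """Map a point in the original coordinates to the rescaled coordinates."""
    return [Fraction(xj, dj) for xj, dj in zip(x, d)]


def dot(u, v):
    """Inner product of two equally long sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


point = [3, 1, 4]                  # arbitrary feasible point: 3 + 1 + 4 = 8
scaled = to_scaled(point)          # the same point in the new units
```

The rescaled coefficients are (2, 2, 4) and (2, 4, 0), the trial solution maps to
(1, 1, 1), and both the constraint value and the objective value of the test point are
the same in either coordinate system.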











Figure 9.5 Example after rescaling for iteration 1.

Note that the trial solution (1, 1, 1) in Fig. 9.5 is equidistant from the three
constraint boundaries, x̃_1 = 0, x̃_2 = 0, and x̃_3 = 0. For each subsequent iteration as
well, the problem is rescaled again to achieve this same property, so that the current
trial solution always is (1, 1, 1) in the current coordinates.

Summary and Illustration of the Algorithm 


Now let us summarize and illustrate the algorithm by going through the first iteration 
for the example, then giving a summary of the general procedure, and then applying 
this summary to a second iteration. 


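ITERATION 1: Given the initial trial solution, (x_1, x_2, x_3) = (2, 2, 4), let D be the
corresponding diagonal matrix, so that

\[
\mathbf{D} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \end{bmatrix}.
\]

The rescaled variables then are the components of

\[
\tilde{\mathbf{x}} = \mathbf{D}^{-1}\mathbf{x} =
\begin{bmatrix} \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & 0 & \tfrac{1}{4} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} =
\begin{bmatrix} \tfrac{x_1}{2} \\ \tfrac{x_2}{2} \\ \tfrac{x_3}{4} \end{bmatrix}.
\]

In these new coordinates, A and c have become

\[
\tilde{\mathbf{A}} = \mathbf{A}\mathbf{D} = [1\ \ 1\ \ 1]
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \end{bmatrix} = [2\ \ 2\ \ 4],
\]

\[
\tilde{\mathbf{c}} = \mathbf{D}\mathbf{c} =
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \end{bmatrix}
\begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} =
\begin{bmatrix} 2 \\ 4 \\ 0 \end{bmatrix}.
\]

Therefore, the projection matrix is

\[
\begin{aligned}
\mathbf{P} &= \mathbf{I} - \tilde{\mathbf{A}}^T(\tilde{\mathbf{A}}\tilde{\mathbf{A}}^T)^{-1}\tilde{\mathbf{A}} \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
 - \begin{bmatrix} 2 \\ 2 \\ 4 \end{bmatrix}
 \left( [2\ \ 2\ \ 4] \begin{bmatrix} 2 \\ 2 \\ 4 \end{bmatrix} \right)^{-1} [2\ \ 2\ \ 4] \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
 - \frac{1}{24} \begin{bmatrix} 4 & 4 & 8 \\ 4 & 4 & 8 \\ 8 & 8 & 16 \end{bmatrix}
 = \begin{bmatrix} \tfrac{5}{6} & -\tfrac{1}{6} & -\tfrac{1}{3} \\ -\tfrac{1}{6} & \tfrac{5}{6} & -\tfrac{1}{3} \\ -\tfrac{1}{3} & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix},
\end{aligned}
\]

so that the projected gradient is

\[
\mathbf{c}_p = \mathbf{P}\tilde{\mathbf{c}} =
\begin{bmatrix} \tfrac{5}{6} & -\tfrac{1}{6} & -\tfrac{1}{3} \\ -\tfrac{1}{6} & \tfrac{5}{6} & -\tfrac{1}{3} \\ -\tfrac{1}{3} & -\tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix}
\begin{bmatrix} 2 \\ 4 \\ 0 \end{bmatrix} =
\begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}.
\]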

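Define v as the absolute value of the negative component of c_p having the largest
absolute value, so that v = |-2| = 2 in this case. Consequently, in the current
coordinates, the algorithm now moves from the current trial solution, (x̃_1, x̃_2, x̃_3) =
(1, 1, 1), to the next trial solution,

\[
\tilde{\mathbf{x}} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \frac{\alpha}{v}\,\mathbf{c}_p
= \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \frac{0.5}{2} \begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}
= \begin{bmatrix} \tfrac{5}{4} \\ \tfrac{7}{4} \\ \tfrac{1}{2} \end{bmatrix},
\]

as shown in Fig. 9.5. (The definition of v has been chosen to make the smallest
component of x̃ equal to zero when α = 1 in this equation for the next trial solution.)
In the original coordinates, this solution is

\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \mathbf{D}\tilde{\mathbf{x}} =
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \end{bmatrix}
\begin{bmatrix} \tfrac{5}{4} \\ \tfrac{7}{4} \\ \tfrac{1}{2} \end{bmatrix} =
\begin{bmatrix} \tfrac{5}{2} \\ \tfrac{7}{2} \\ 2 \end{bmatrix}.
\]

This completes the iteration, and this new solution will be used to start the next
iteration.

These steps can be summarized as follows for any iteration.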


Summary of the Interior-Point Algorithm


Step 1: Given the current trial solution (x_1, x_2, ..., x_n), set

\[
\mathbf{D} = \begin{bmatrix}
x_1 & 0 & 0 & \cdots & 0 \\
0 & x_2 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & x_n
\end{bmatrix}.
\]

Step 2: Calculate Ã = AD and c̃ = Dc.


Step 3: Calculate P = I - Ã^T(ÃÃ^T)^{-1}Ã and c_p = Pc̃.


Step 4: Identify the negative component of c_p having the largest absolute value, and
set v equal to this absolute value. Then calculate

\[
\tilde{\mathbf{x}} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} + \frac{\alpha}{v}\,\mathbf{c}_p,
\]

where α is a selected constant between 0 and 1 (e.g., α = 0.5).


Step 5: Calculate x = Dx̃ as the trial solution for the next iteration (step 1). (If this
trial solution is virtually unchanged from the preceding one, then the algorithm has
virtually converged to an optimal solution, so stop.)
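
The five steps can be sketched in Python with exact fractions, as below. This is an
illustrative sketch rather than a definitive implementation: the function names solve
and iterate are assumptions, the problem is assumed to be already in augmented form with
a strictly interior starting point, and instead of forming P the code solves
(ÃÃ^T)w = Ãc̃ and uses the algebraically equivalent expression c_p = c̃ - Ã^T w.

```python
from fractions import Fraction


def solve(M, rhs):
    """Solve the square system M w = rhs exactly by Gauss-Jordan elimination."""
    n = len(M)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for k in range(n):
        p = next(i for i in range(k, n) if aug[i][k] != 0)   # find a pivot row
        aug[k], aug[p] = aug[p], aug[k]
        aug[k] = [v / aug[k][k] for v in aug[k]]
        for i in range(n):
            if i != k:
                f = aug[i][k]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[k])]
    return [row[-1] for row in aug]


def iterate(A, c, x, alpha=Fraction(1, 2)):
    """Perform one iteration (steps 1 to 5) from the interior trial solution x."""
    n, m = len(x), len(A)
    # Steps 1 and 2: with D = diag(x), form A D and D c.
    A_t = [[Fraction(a) * xj for a, xj in zip(row, x)] for row in A]
    c_t = [Fraction(cj) * xj for cj, xj in zip(c, x)]
    # Step 3: projected gradient c_p = c_t - A_t^T w, with (A_t A_t^T) w = A_t c_t.
    G = [[sum(a * b for a, b in zip(r1, r2)) for r2 in A_t] for r1 in A_t]
    w = solve(G, [sum(a * cj for a, cj in zip(row, c_t)) for row in A_t])
    c_p = [c_t[j] - sum(w[i] * A_t[i][j] for i in range(m)) for j in range(n)]
    # Step 4: v is the largest absolute value among the negative components.
    v = -min(c_p)
    if v <= 0:
        raise ValueError("projected gradient has no negative component")
    x_t = [1 + alpha / v * cpj for cpj in c_p]
    # Step 5: return to the original coordinates, x = D x_t.
    return [xj * xtj for xj, xtj in zip(x, x_t)]


A, c = [[1, 1, 1]], [1, 2, 0]
x1 = iterate(A, c, [2, 2, 4])      # trial solution that starts iteration 2
x2 = iterate(A, c, x1)             # trial solution that starts iteration 3
```

For the example, the first call returns (5/2, 7/2, 2). Calling iterate repeatedly until
successive trial solutions barely change corresponds to the stopping rule in step 5.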


Now let us apply this summary to iteration 2 for the example. 


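ITERATION 2:

Step 1: Given the current trial solution, (x_1, x_2, x_3) = (5/2, 7/2, 2), set

\[
\mathbf{D} = \begin{bmatrix} \tfrac{5}{2} & 0 & 0 \\ 0 & \tfrac{7}{2} & 0 \\ 0 & 0 & 2 \end{bmatrix}.
\]

(Note that the rescaled variables are

\[
\begin{bmatrix} \tilde{x}_1 \\ \tilde{x}_2 \\ \tilde{x}_3 \end{bmatrix} = \mathbf{D}^{-1}\mathbf{x} =
\begin{bmatrix} \tfrac{2}{5} & 0 & 0 \\ 0 & \tfrac{2}{7} & 0 \\ 0 & 0 & \tfrac{1}{2} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} =
\begin{bmatrix} \tfrac{2}{5}x_1 \\ \tfrac{2}{7}x_2 \\ \tfrac{1}{2}x_3 \end{bmatrix},
\]

so that the basic feasible solutions in these new coordinates are

\[
\tilde{\mathbf{x}} = \mathbf{D}^{-1}\begin{bmatrix} 8 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} \tfrac{16}{5} \\ 0 \\ 0 \end{bmatrix}, \qquad
\tilde{\mathbf{x}} = \mathbf{D}^{-1}\begin{bmatrix} 0 \\ 8 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ \tfrac{16}{7} \\ 0 \end{bmatrix}, \qquad \text{and} \qquad
\tilde{\mathbf{x}} = \mathbf{D}^{-1}\begin{bmatrix} 0 \\ 0 \\ 8 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 4 \end{bmatrix},
\]

as depicted in Fig. 9.6.)

Step 2:

\[
\tilde{\mathbf{A}} = \mathbf{A}\mathbf{D} = \left[\tfrac{5}{2},\ \tfrac{7}{2},\ 2\right] \qquad \text{and} \qquad
\tilde{\mathbf{c}} = \mathbf{D}\mathbf{c} = \begin{bmatrix} \tfrac{5}{2} \\ 7 \\ 0 \end{bmatrix}.
\]

Step 3:

\[
\mathbf{P} = \begin{bmatrix} \tfrac{13}{18} & -\tfrac{7}{18} & -\tfrac{2}{9} \\ -\tfrac{7}{18} & \tfrac{41}{90} & -\tfrac{14}{45} \\ -\tfrac{2}{9} & -\tfrac{14}{45} & \tfrac{37}{45} \end{bmatrix} \qquad \text{and} \qquad
\mathbf{c}_p = \begin{bmatrix} -\tfrac{11}{12} \\ \tfrac{133}{60} \\ -\tfrac{41}{15} \end{bmatrix}.
\]

Step 4: |-41/15| > |-11/12|, so v = 41/15, so

\[
\tilde{\mathbf{x}} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \frac{0.5}{41/15}
\begin{bmatrix} -\tfrac{11}{12} \\ \tfrac{133}{60} \\ -\tfrac{41}{15} \end{bmatrix} =
\begin{bmatrix} \tfrac{273}{328} \\ \tfrac{461}{328} \\ \tfrac{1}{2} \end{bmatrix} \approx
\begin{bmatrix} 0.83 \\ 1.41 \\ 0.50 \end{bmatrix}.
\]

Step 5:

\[
\mathbf{x} = \mathbf{D}\tilde{\mathbf{x}} =
\begin{bmatrix} \tfrac{1365}{656} \\ \tfrac{3227}{656} \\ 1 \end{bmatrix} \approx
\begin{bmatrix} 2.08 \\ 4.92 \\ 1.00 \end{bmatrix}
\]

is the trial solution for iteration 3.

Figure 9.6 Example after rescaling for iteration 2.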

Since there is little to be learned by repeating these calculations for additional
iterations, we shall stop here. However, we do show in Fig. 9.7 the reconfigured
feasible region after rescaling based on the trial solution just obtained for iteration 3.
As always, the rescaling has placed the trial solution at $(\tilde{x}_1, \tilde{x}_2, \tilde{x}_3) = (1, 1, 1)$,
equidistant from the $\tilde{x}_1 = 0$, $\tilde{x}_2 = 0$, and $\tilde{x}_3 = 0$ constraint boundaries. Note in Figs.
9.5, 9.6, and 9.7 how the sequence of iterations and rescaling have the effect of
"sliding" the optimal solution toward (1, 1, 1) while the other basic feasible solutions
tend to slide away. Eventually, after enough iterations, the optimal solution will lie
very near $(\tilde{x}_1, \tilde{x}_2, \tilde{x}_3) = (0, 1, 0)$ after rescaling, while the other two basic feasible













Figure 9.7 Example after rescaling for iteration 3. 
Figure 9.8 Trajectory of the interior-point algorithm for the example in the original $x_1$–$x_2$ coordinate
system.


solutions will be very far from the origin on the $\tilde{x}_1$ and $\tilde{x}_3$ axes. Step 5 of that iteration
then will yield a solution in the original coordinates very near the optimal solution of
$(x_1, x_2, x_3) = (0, 8, 0)$.

Figure 9.8 shows the progress of the algorithm in the original $x_1$–$x_2$ coordinate
system before augmenting the problem. The three points, $(x_1, x_2) = (2, 2)$,
$(2.5, 3.5)$, and $(2.08, 4.92)$, are the trial solutions for initiating iterations 1, 2, and
3, respectively. We then have drawn a smooth curve through and beyond these points
to show the trajectory of the algorithm in subsequent iterations as it approaches
$(x_1, x_2) = (0, 8)$.

The functional constraint for this particular example happened to be an inequality
constraint. However, equality constraints cause no difficulty for the algorithm, since
it deals with the constraints only after any necessary augmenting has been done to
convert them to equality form, $A\mathbf{x} = \mathbf{b}$, anyway. To illustrate, suppose that the only
change in the example is that the constraint $x_1 + x_2 \le 8$ is changed to $x_1 + x_2 = 8$.
Thus the feasible region in Fig. 9.3 changes to just the line segment between
(8, 0) and (0, 8). Given an initial feasible trial solution in the interior ($x_1 > 0$ and
$x_2 > 0$) of this line segment, say $(x_1, x_2) = (4, 4)$, the algorithm can proceed just
as presented in the five-step summary with just the two variables and $A = [1 \;\; 1]$.
For each iteration, the projected gradient points along this line segment in the direction
of (0, 8). With $\alpha = \tfrac{1}{2}$, iteration 1 leads from (4, 4) to (2, 6), iteration 2 leads from
(2, 6) to (1, 7), etc. (Problem 29 asks you to verify these results.)
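
The two iterations quoted for the equality-constrained version can be checked with a short sketch. It is an illustration only: the function name `step_single_constraint` is hypothetical, exact fractions are an assumption made for readability, and the shortcut of replacing $(\tilde{A}\tilde{A}^T)^{-1}$ by a reciprocal is valid only because $A$ has a single row here.

```python
from fractions import Fraction

def step_single_constraint(a, c, x, alpha):
    """One interior-point iteration when A = [a] has a single row."""
    x = [Fraction(xj) for xj in x]
    a_t = [ai * xi for ai, xi in zip(a, x)]   # A~ = A D
    c_t = [ci * xi for ci, xi in zip(c, x)]   # c~ = D c
    # With one row, (A~ A~^T)^{-1} is just the reciprocal of a number.
    ratio = sum(p * q for p, q in zip(a_t, c_t)) / sum(p * p for p in a_t)
    c_p = [ci - ratio * ai for ci, ai in zip(c_t, a_t)]   # projected gradient
    v = max(-g for g in c_p if g < 0)         # largest |negative component|
    return [xi * (1 + alpha * g / v) for xi, g in zip(x, c_p)]

x = [4, 4]                                    # initial interior trial solution
for _ in range(2):
    x = step_single_constraint([1, 1], [1, 2], x, Fraction(1, 2))
    print([str(v) for v in x])                # ['2', '6'] then ['1', '7']
```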

Although either version of the example has only one functional constraint, having
more than one leads to just one change in the procedure as already illustrated
(other than more extensive calculations). Having a single functional constraint in the
example meant that $A$ had only a single row, so the $(\tilde{A}\tilde{A}^T)^{-1}$ term in step 3 only
involved taking the reciprocal of the number obtained from the vector product $(\tilde{A}\tilde{A}^T)$.
Multiple functional constraints means that $A$ has multiple rows, so then the $(\tilde{A}\tilde{A}^T)^{-1}$
term involves finding the inverse of the matrix obtained from the matrix product
$(\tilde{A}\tilde{A}^T)$.

To conclude, we need to add a comment to place the algorithm into better
perspective. For our extremely small example, the algorithm requires relatively
extensive calculations, and then, after many iterations, obtains only an approximation
of the optimal solution. By contrast, the graphical procedure of Sec. 3.1 finds the
optimal solution in Fig. 9.3 immediately, and the simplex method requires only one
quick iteration. However, do not let this contrast fool you into downgrading the
efficiency of the interior-point algorithm. This algorithm is designed for dealing with
big problems having many hundreds or thousands of functional constraints. The
simplex method typically requires thousands of iterations on such problems. By
"shooting" through the interior of the feasible region, the interior-point algorithm tends to
require a substantially smaller number of iterations (although with considerably more
work per iteration). Therefore, as discussed in Sec. 4.9, interior-point algorithms
similar to the one presented here should play an important role in the future of linear
programming.


9.5 Conclusions 


The upper bound technique provides a way of streamlining the simplex method for 
the common situation in which many or all of the variables have explicit upper bounds. 
It can greatly reduce the computational effort for large problems. 

The dual simplex method and parametric linear programming are especially
valuable for sensitivity analysis, although they also can be very useful in other contexts.

Mathematical-programming computer packages usually include all three of these
procedures, and they are widely used. Because their basic structure is based largely
upon the simplex method as presented in Chap. 4, they retain the exceptional
computational efficiency to handle very large problems of the sizes described in Sec. 4.8.

Various other special-purpose algorithms also have been developed to exploit 
the special structure of particular types of linear programming problems (such as those 
discussed in Chap. 7). Much research is currently being done in this area. 

Karmarkar’s interior-point algorithm has been an exciting new development in 
linear programming. This algorithm and its variants hold much promise as a powerful 
new approach for efficiently solving some very large problems. 
PROBLEMS


1. Use the upper bound technique to solve the Wyndor Glass Co. problem presented in 
Sec. 3.1. 


2. Consider the following problem. 


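\[
\begin{aligned}
\text{Maximize} \quad & Z = 2x_1 + x_2, \\
\text{subject to} \quad & x_1 - x_2 \le 5 \\
& x_1 \le 10 \\
& x_2 \le 10 \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0.
\end{aligned}
\]

(a) Solve this problem graphically.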
(b) Use the upper bound technique to solve this problem. 
(c) Trace graphically the path taken by the upper bound technique. 


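3.* Use the upper bound technique to solve the following problem.

\[
\begin{aligned}
\text{Maximize} \quad & Z = x_1 + 3x_2 - 2x_3, \\
\text{subject to} \quad & x_2 - 2x_3 \le 1 \\
& 2x_1 + x_2 + 2x_3 \le 8 \\
& x_1 \le 1 \\
& x_2 \le 3 \\
& x_3 \le 2 \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0, \quad x_3 \ge 0.
\end{aligned}
\]


4. Use the upper bound technique to solve the following problem.

\[
\begin{aligned}
\text{Maximize} \quad & Z = 2x_1 + 3x_2 - 2x_3 + 5x_4, \\
\text{subject to} \quad & 2x_1 + 2x_2 + x_3 + 2x_4 \le 5 \\
& x_1 + 2x_2 - 3x_3 + 4x_4 \le 5 \\
\text{and} \quad & 0 \le x_j \le 1, \quad \text{for } j = 1, 2, 3, 4.
\end{aligned}
\]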


5. Use the upper bound technique to solve the linear programming model given in
Prob. 4(b), Chap. 6.


6. Consider the following problem. 
\[
\begin{aligned}
\text{Maximize} \quad & Z = -x_1 - x_2, \\
\text{subject to} \quad & x_1 + x_2 \le 8 \\
& x_2 \ge 3 \\
& -x_1 + x_2 \le 2 \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0.
\end{aligned}
\]

(a) Solve this problem graphically.
(b) Use the dual simplex method to solve this problem.
(c) Trace graphically the path taken by the dual simplex method.


7. Use the dual simplex method to solve each of the following linear programming 
models: 

(a) Model given in Prob. 17, Chap. 4. 

(b) Model given in Prob. 26, Chap. 4. 


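8.* Use the dual simplex method to solve the following problem.

\[
\begin{aligned}
\text{Minimize} \quad & Z = 5x_1 + 2x_2 + 4x_3, \\
\text{subject to} \quad & 3x_1 + x_2 + 2x_3 \ge 4 \\
& 6x_1 + 3x_2 + 5x_3 \ge 10 \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0, \quad x_3 \ge 0.
\end{aligned}
\]


9. Use the dual simplex method to solve the following problem.

\[
\begin{aligned}
\text{Minimize} \quad & Z = 7x_1 + 2x_2 + 5x_3 + 4x_4, \\
\text{subject to} \quad & 2x_1 + 4x_2 + 7x_3 + x_4 \ge 5 \\
& 8x_1 + 4x_2 + 6x_3 + 4x_4 \ge 8 \\
& 3x_1 + 8x_2 + x_3 + 4x_4 \ge 4 \\
\text{and} \quad & x_j \ge 0, \quad \text{for } j = 1, 2, 3, 4.
\end{aligned}
\]


10. Consider the following problem.

\[
\begin{aligned}
\text{Maximize} \quad & Z = 3x_1 + 2x_2, \\
\text{subject to} \quad & 3x_1 + x_2 \le 12 \\
& x_1 + x_2 \le 6 \\
& 5x_1 + 3x_2 \le 27 \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0.
\end{aligned}
\]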


(a) Solve by the original simplex method (in tabular form). Identify the complementary 
basic solution for the dual problem obtained at each iteration. 

(b) Solve the dual of this problem by the dual simplex method. Compare the resulting 
sequence of basic solutions with the complementary basic solutions obtained in 
part (a). 


11. Consider the example for case 1 of sensitivity analysis given in Sec. 6.7, where the 
initial simplex tableau of Table 4.8 is modified by changing $b_2$ from 12 to 24, thereby changing
the respective entries in the right-side column of the final simplex tableau to 54, 6, 12, and 
—2. Starting from this revised final simplex tableau, use the dual simplex method to obtain the 
new optimal solution shown in Table 6.20. Show your work. 


12.* Consider Prob. 35 (a) and (b), Chap. 6. Use the dual simplex method to obtain 
the new optimal solution for each of these two cases. 


13. Use both the upper bound technique and the dual simplex method to solve the 
following problem. 


\[
\begin{aligned}
\text{Minimize} \quad & Z = 3x_1 + 4x_2 + 2x_3, \\
\text{subject to} \quad & x_1 + x_2 \ge 15 \\
& x_2 + x_3 \ge 10 \\
\text{and} \quad & 0 \le x_1 \le 25, \quad 0 \le x_2 \le 5, \quad 0 \le x_3 \le 15.
\end{aligned}
\]

14. Use both the upper bound technique and the dual simplex method to solve the 
Nori & Leets Co. problem given in Sec. 3.4 for controlling air pollution. 

15.* Consider the following problem. 

\[
\begin{aligned}
\text{Maximize} \quad & Z = 8x_1 + 24x_2, \\
\text{subject to} \quad & x_1 + 2x_2 \le 10 \\
& 2x_1 + x_2 \le 10 \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0.
\end{aligned}
\]

Suppose that Z represents profit and that it is possible to modify the objective function somewhat
by an appropriate shifting of key personnel between the two activities. In particular, suppose
that the unit profit of activity 1 can be increased above 8 (to a maximum of 18) at the expense
of decreasing the unit profit of activity 2 below 24 by twice the amount. Thus Z can actually
be represented as

\[
Z(\theta) = (8 + \theta)x_1 + (24 - 2\theta)x_2,
\]

where $\theta$ is also a decision variable such that $0 \le \theta \le 10$.

(a) Solve the original form of this problem graphically. Then extend this graphical
procedure to solve the parametric extension of the problem; i.e., find the optimal
solution and the optimal value of $Z(\theta)$ as a function of $\theta$, for $0 \le \theta \le 10$.

(b) Find the optimal solution to the original form of the problem by the simplex method.
Then use parametric linear programming to find the optimal solution and the optimal
value of $Z(\theta)$ as a function of $\theta$, for $0 \le \theta \le 10$. Plot $Z(\theta)$.

(c) Determine the optimal value of $\theta$. Then indicate how this optimal value could have
been identified directly by solving only two ordinary linear programming problems.
(Hint: A convex function achieves its maximum at an endpoint.)


16. Use parametric linear programming to find the optimal solution of the following
problem as a function of $\theta$, for $0 \le \theta \le 20$.

\[
\begin{aligned}
\text{Maximize} \quad & Z(\theta) = (20 + 4\theta)x_1 + (30 - 3\theta)x_2 + 5x_3, \\
\text{subject to} \quad & 3x_1 + 3x_2 + x_3 \le 30 \\
& 8x_1 + 6x_2 + 4x_3 \le 75 \\
& 6x_1 + x_2 + x_3 \le 45 \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0, \quad x_3 \ge 0.
\end{aligned}
\]


17. Consider the following problem.

\[
\begin{aligned}
\text{Maximize} \quad & Z(\theta) = (10 - \theta)x_1 + (12 + \theta)x_2 + (7 + 2\theta)x_3, \\
\text{subject to} \quad & x_1 + 2x_2 + 2x_3 \le 30 \\
& x_1 + x_2 + x_3 \le 20 \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0, \quad x_3 \ge 0.
\end{aligned}
\]

(a) Use parametric linear programming to find the optimal solution for this problem as
a function of $\theta$, for $\theta \ge 0$.

(b) Construct the dual model for this problem. Then find the optimal solution for this
dual problem as a function of $\theta$, for $\theta \ge 0$, by the method described in the latter
part of Sec. 9.3. Indicate graphically what this algebraic procedure is doing. Compare
the basic solutions obtained with the complementary basic solutions obtained
in part (a).


18. Consider Prob. 43, Chap. 6. Use parametric linear programming to find the optimal
solution as a function of $\theta$, for $\theta \ge 0$.


19. Consider Prob. 45(b), Chap. 6. Extend the parametric linear programming procedure
for making systematic changes in the $c_j$ parameters to consider also systematic changes in
the $a_{ij}$ parameters in order to find the optimal solution as a function of $\theta$, for $0 \le \theta \le 1$.


20.* Use the parametric linear programming procedure for making systematic changes
in the $b_i$ parameters to find the optimal solution for the following problem as a function of $\theta$,
for $0 \le \theta \le 25$.


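\[
\begin{aligned}
\text{Maximize} \quad & Z(\theta) = 2x_1 + 2x_2, \\
\text{subject to} \quad & x_1 \le 10 + 2\theta \\
& x_1 + x_2 \le 25 - \theta \\
& x_2 \le 10 + 2\theta \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0.
\end{aligned}
\]

Indicate graphically what this algebraic procedure is doing.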


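21. Use parametric linear programming to find the optimal solution for the following
problem as a function of $\theta$, for $0 \le \theta \le 30$.

\[
\begin{aligned}
\text{Maximize} \quad & Z(\theta) = 5x_1 + 6x_2 + 4x_3 + 7x_4, \\
\text{subject to} \quad & 3x_1 - 2x_2 + x_3 + 3x_4 \le 135 - 2\theta \\
& 2x_1 + 4x_2 - x_3 + 2x_4 \le 78 - \theta \\
& x_1 + 2x_2 + x_3 + 2x_4 \le 30 + \theta \\
\text{and} \quad & x_j \ge 0, \quad \text{for } j = 1, 2, 3, 4.
\end{aligned}
\]

Then identify the value of $\theta$ that gives the largest optimal value of $Z(\theta)$.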


22. Consider Prob. 36, Chap. 6. Use parametric linear programming to find the optimal
solution as a function of $\theta$ over the following ranges of $\theta$.

(a) $0 \le \theta \le 20$.

(b) $-20 \le \theta \le 0$. (Hint: Substitute $-\theta'$ for $\theta$, and then increase $\theta'$ from zero.)


23. Follow the instructions of Prob. 22 for Prob. 40, Chap. 6. 


24. Consider the $Z^*(\theta)$ function shown in Fig. 9.1 for parametric linear programming
with systematic changes in the $c_j$ parameters.

(a) Explain why this function is piecewise linear.

(b) Show that this function must be convex.


25. Consider the $Z^*(\theta)$ function shown in Fig. 9.2 for parametric linear programming
with systematic changes in the $b_i$ parameters.

(a) Explain why this function is piecewise linear.

(b) Show that this function must be concave.


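26. Let

\[
\begin{aligned}
& Z^* = \max \Bigl\{ \sum_{j=1}^{n} c_j x_j \Bigr\}, \\
\text{subject to} \quad & \sum_{j=1}^{n} a_{ij} x_j \le b_i, \quad \text{for } i = 1, 2, \ldots, m, \\
\text{and} \quad & x_j \ge 0, \quad \text{for } j = 1, 2, \ldots, n
\end{aligned}
\]

(where the $a_{ij}$, $b_i$, and $c_j$ are fixed constants), and let $(y_1^*, y_2^*, \ldots, y_m^*)$ be the corresponding
optimal dual solution. Then let

\[
\begin{aligned}
& Z^{**} = \max \Bigl\{ \sum_{j=1}^{n} c_j x_j \Bigr\}, \\
\text{subject to} \quad & \sum_{j=1}^{n} a_{ij} x_j \le b_i + k_i, \quad \text{for } i = 1, 2, \ldots, m, \\
\text{and} \quad & x_j \ge 0, \quad \text{for } j = 1, 2, \ldots, n,
\end{aligned}
\]

where $k_1, k_2, \ldots, k_m$ are given constants. Show that

\[
Z^{**} \le Z^* + \sum_{i=1}^{m} k_i y_i^*.
\]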


27. Reconsider the example used to illustrate the interior-point algorithm in Sec. 9.4. 
Suppose that $(x_1, x_2) = (1, 3)$ were used instead as the initial feasible trial solution. Perform
two iterations starting from this solution. 


28. Consider the following problem. 
\[
\begin{aligned}
\text{Maximize} \quad & Z = 3x_1 + x_2, \\
\text{subject to} \quad & x_1 + x_2 \le 4 \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0.
\end{aligned}
\]

(a) Solve this problem graphically. Also identify all corner-point feasible solutions.

(b) Starting from the initial trial solution $(x_1, x_2) = (1, 1)$, perform four iterations of
the interior-point algorithm presented in Sec. 9.4.

(c) Draw figures corresponding to Figs. 9.4, 9.5, 9.6, 9.7, and 9.8 for this problem. 
In each case, identify the basic (or corner-point) feasible solutions in the current 
coordinate system. (Trial solutions can be used to determine projected gradients.) 


29. Consider the following problem. 
\[
\begin{aligned}
\text{Maximize} \quad & Z = x_1 + 2x_2, \\
\text{subject to} \quad & x_1 + x_2 = 8 \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0.
\end{aligned}
\]

(a) Near the end of Sec. 9.4, there is a discussion of what the interior-point algorithm
does on this problem when starting from the initial feasible trial solution $(x_1, x_2) = (4, 4)$.
Verify the results presented there by performing two iterations.

(b) Use these results to predict what subsequent trial solutions would be if additional
iterations were to be performed.

(c) Suppose that the stopping rule adopted for the algorithm in this application is that
the algorithm stops when two successive trial solutions differ by no more than 0.01
in any component. Use your predictions from part (b) to predict the final trial solution
and the total number of iterations required to get there. How close would this solution
be to the optimal solution $(x_1, x_2) = (0, 8)$?


30. Consider the following problem. 
\[
\begin{aligned}
\text{Maximize} \quad & Z = x_1 + x_2, \\
\text{subject to} \quad & x_1 + 2x_2 \le 9 \\
& 2x_1 + x_2 \le 9 \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0.
\end{aligned}
\]

(a) Solve the problem graphically.

(b) Find the gradient of the objective function in the original $x_1$–$x_2$ coordinate system.
If you move from the origin in the direction of the gradient until you reach the
boundary of the feasible region, where does it lead relative to the optimal solution?

(c) Starting from the initial trial solution $(x_1, x_2) = (1, 1)$, perform 10 iterations of the
interior-point algorithm presented in Sec. 9.4.

(d) Repeat part (c) with $\alpha = 0.9$.


31. Consider the following problem. 
\[
\begin{aligned}
\text{Maximize} \quad & Z = 2x_1 + 5x_2 + 7x_3, \\
\text{subject to} \quad & x_1 + 2x_2 + 3x_3 = 6 \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0, \quad x_3 \ge 0.
\end{aligned}
\]

(a) Graph the feasible region.

(b) Find the gradient of the objective function and then find the projected gradient onto
the feasible region.

(c) Starting from the initial trial solution $(x_1, x_2, x_3) = (1, 1, 1)$, perform two iterations
of the interior-point algorithm presented in Sec. 9.4.

(d) Perform eight additional iterations. 


32. Starting from the initial trial solution $(x_1, x_2) = (2, 2)$, apply 15 iterations of the
interior-point algorithm presented in Sec. 9.4 to the Wyndor Glass Co. problem presented in
Sec. 3.1. Also draw a figure like Fig. 9.8 to show the trajectory of the algorithm in the original
$x_1$–$x_2$ coordinate system.
Networks arise in numerous settings and in a variety of guises. Transportation,
electrical, and communication networks pervade our daily lives. Network representations
also are widely used for problems in such diverse areas as production, distribution,
project planning, facilities location, resource management, and financial planning,
to name just a few examples. In fact, a network representation provides such a
powerful visual and conceptual aid for portraying the relationships between the components
of systems that it is used in virtually every field of scientific, social, and economic
endeavor.

One of the most exciting developments in operations research in recent years
has been the unusually rapid advance in both the methodology and application of
network optimization models. A number of algorithmic breakthroughs in the 1970s
and 1980s have had a major impact, as have ideas from computer science concerning
data structures and efficient data manipulation. Consequently, algorithms and software
now are available and are being used to solve huge problems on a routine basis that
would have been completely intractable a couple of decades ago.

As one example of a recent application, an award-winning study (see Selected
References 8 and 9) was conducted during the mid-1980s at Citgo Petroleum
Corporation, which is devoted to petroleum refining and marketing operations and has
annual sales of several billion dollars. When Citgo was acquired by Southland
Corporation (best known for its 7-Eleven stores) in 1983, top management saw an urgent
need for a modeling system to help Citgo overcome the pressures of volatile crude
oil prices and a 30-fold increase in the costs of financing working capital. The
operations research team developed an optimization-based decision support system using
network methodology, and coupled this system with an on-line corporate database.
The model takes in all aspects of the business, helping management decide everything
from run levels at the various refineries to what prices to pay or charge. A network
representation is essential because of the flow of goods through several stages:
purchase of crude oil from various suppliers, shipping it to refineries, refining it into
various products, and sending the products to distribution centers and product storage
terminals for subsequent sale. The modeling system enabled the company to reduce
its petroleum products inventory by $116 million. This has meant a savings in annual
interest of $14 million, and improvements in coordination, pricing, and purchasing
decisions worth another $2.5 million each year.

In this Citgo study, the model used for each product fits the model for the 
minimum cost flow problem presented in Sec. 10.6. Each product’s model has about 
3,000 equations (nodes) and 15,000 variables (arcs), which is of very modest size by 
today’s standards for the application of network optimization models. 

In this one chapter we shall be able only to scratch the surface of the current 
state of the art of network methodology. However, we shall introduce you to five 
important kinds of network problems and some basic ideas of how to solve them 
(without delving into issues of data structures, etc., that are so vital to successful 
large-scale implementations). Each of the first three problem types—the shortest path 
problem, the minimum spanning tree problem, and the maximum flow problem—has 
a very specific structure that arises frequently in applications. 

The fourth type, the minimum cost flow problem, provides a unified approach
to many other applications because of its far more general structure. In fact, this
structure is so general that it includes as special cases both the shortest path problem
and the maximum flow problem, as well as the transportation problem, the
transshipment problem, and the assignment problem from Chap. 7. Like many network
optimization models, the minimum cost flow problem can be formulated as a linear
programming problem, and it can be solved extremely efficiently by a streamlined version
of the simplex method called the network simplex method. (We shall not discuss even
more general network problems that are more difficult to solve.)

The last problem type considered is project planning and control with PERT
(Program Evaluation and Review Technique) and CPM (Critical Path Method).
Although limited to this one area of application, PERT and CPM have proven to be
invaluable tools there. In fact, since their development in the late 1950s, PERT and
CPM have been (and probably continue to be) the most widely used kind of network
technique in operations research.

The first section introduces a prototype example that will be used subsequently 
to illustrate the approach to the first three of these problems. Section 10.2 presents 


some basic terminology for networks. The next four sections deal with the first four 
problems in turn. Section 10.7 then is devoted to the network simplex method, and 
Sec. 10.8 discusses the last problem type. 


10.1 Prototype Example 


SEERVADA PARK has recently been set aside for a limited amount of sightseeing 
and backpack hiking. Cars are not allowed into the park, but there is a narrow, winding 
road system for trams and for jeeps driven by the park rangers. This road system is 
shown (without the curves) in Fig. 10.1, where location O is the entrance into the 
park; other letters designate the locations of ranger stations (and other limited
facilities). The numbers give the distances of these winding roads in miles.

The park contains a scenic wonder at station T. A small number of trams are 
used to transport sightseers from the park entrance to station T and back. 

The park management currently faces three problems. One is to determine which 
route from the park entrance to station T has the smallest total distance for the
operation of the trams. (This is an example of the shortest path problem to be discussed
in Sec. 10.3.) 

A second problem is that telephone lines must be installed under the roads to 
establish telephone communication among all the stations (including the park
entrance). Because the installation is both expensive and disruptive to the natural
environment, lines will be installed under just enough roads to provide some connection
between every pair of stations. The question is where the lines should be laid to 
accomplish this with a minimum total number of miles of line installed. (This is an 
example of the minimum spanning tree problem to be discussed in Sec. 10.4.) 

The third problem is that more people want to take the tram ride from the park 
entrance to station T than can be accommodated during the peak season. To avoid 
unduly disturbing the ecology and wildlife of the region, a strict ration has been placed 
on the number of tram trips that can be made on each of the roads per day. (These 
limits differ for the different roads, as we shall describe in detail in Sec. 10.5.) 
Therefore, during the peak season, various routes might be followed regardless of 
distance to increase the number of tram trips that can be made each day. The question 





Figure 10.1 The road system for Seervada Park. 
is how to route the various trips to maximize the number of trips that can be made
per day without violating the limits on any individual road. (This is an example of
the maximum flow problem to be discussed in Sec. 10.5.)


10.2 The Terminology of Networks 


A relatively extensive terminology has been developed to describe the various kinds 
of networks and their components. Although we have avoided as much of this special 
vocabulary as we could, we still need to introduce a considerable number of terms 
for use throughout the chapter. We suggest that you read through this section once at 
the outset to understand the definitions, and then plan to return to refresh your memory 
as the terms are used in subsequent sections. To assist you, each term is highlighted 
in boldface at the point where it is defined. 

A network consists of a set of points and a set of lines connecting certain pairs 
of the points. The points are called nodes (or vertices); e.g., the network in Fig. 10.1 
has seven nodes designated by the seven circles. The lines are called arcs (or links
or edges or branches); e.g., the network in Fig. 10.1 has 12 arcs corresponding to 
the 12 roads in the road system. Arcs are labeled by naming the nodes at either end; 
e.g., AB is the arc between nodes A and B in Fig. 10.1. 

The arcs of a network may have a flow of some type through them, e.g., the
flow of trams on the roads of Seervada Park in Sec. 10.1. Table 10.1 gives several
examples of flow in typical networks. If flow through an arc is allowed in only one
direction (e.g., a one-way street), the arc is said to be a directed arc. The direction
is indicated by adding an arrowhead at the end of the line representing the arc. When
labeling a directed arc by listing two nodes it connects, the from node always is given
before the to node; e.g., an arc that is directed from node A to node B must be labeled
as AB rather than BA. Alternatively, this arc may be labeled as A → B.

If the flow through an arc is allowed in both directions (e.g., a two-way street),
the arc is said to be an undirected arc. To help you distinguish between the two 
kinds of arcs, we shall frequently refer to undirected arcs by the suggestive alternative 
name of links. 

A network that has only directed arcs is called a directed network. Similarly, 
if all of its arcs are undirected, the network is said to be an undirected network. A 
network with a mixture of directed and undirected arcs (or even all undirected arcs) 
can be converted into a directed network, if desired, by replacing each undirected arc 
by a pair of directed arcs in opposite directions. 

When two nodes are not connected by an arc, a natural question is whether they 
are connected by a series of arcs. A path between two nodes is a sequence of distinct 
arcs connecting these nodes. For example, one of the paths connecting nodes O and 


Table 10.1 Components of Typical Networks

Nodes               Arcs                         Flow
Intersections       Roads                        Vehicles
Airports            Air lanes                    Aircraft
Switching points    Wires, channels              Messages
Pumping stations    Pipes                        Fluid
Work centers        Materials-handling routes    Jobs














Figure 10.2 Example of a directed network. 


T in Fig. 10.1 is the sequence of arcs OB–BD–DT (O → B → D → T), or vice versa.
When some or all of the arcs in the network are directed arcs, we then distinguish 
between directed paths and undirected paths. A directed path from node i to node j 
is a sequence of connecting arcs whose direction (if any) is toward node j, so that 
flow from node i to node j along this path is feasible. An undirected path from node 
i to node j is a sequence of connecting arcs whose direction (if any) can be either 
toward or away from node j. (Notice that a directed path also satisfies the definition 
of an undirected path, but not vice versa.) Frequently, an undirected path will have 
some arcs directed toward node j but others directed away (i.e., toward node i). You 
will see in Secs. 10.5 and 10.6 that, perhaps surprisingly, undirected paths play a 
major role in the analysis of directed networks. 

To illustrate these definitions, Fig. 10.2 shows a typical directed network. The
sequence of arcs AB–BC–CE (A → B → C → E) is a directed path from node A to
node E, since flow toward node E along this entire path is feasible. On the other hand,
BC–AC–AD (B → C → A → D) is not a directed path from node B to node D, because
the direction of arc AC is away from node D (on this path). However, B → C →
A → D is an undirected path from node B to node D. As an example of the relevance
of this undirected path, suppose that 2 units of flow from node A to node C had
previously been assigned to arc AC. Given this previous assignment, it now is feasible
to assign a smaller flow, say, 1 unit, to the entire undirected path B → C → A →
D from node B to node D, because this involves reducing the flow on arc AC by 1
unit. Reducing a previously assigned flow in the "wrong direction" when adding a
flow to an undirected path will prove to be a key concept in Secs. 10.5 and 10.6.
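
The distinction between directed and undirected paths, and the idea of reducing a previously assigned flow, can be illustrated with a short sketch. It rests on assumptions: the arc list below contains only the arcs of Fig. 10.2 that are named in the text (assumed here to be the complete list), the function names are hypothetical, and the checks do not verify that the arcs of a path are distinct.

```python
# Directed arcs (from node, to node) of Fig. 10.2 as named in the text;
# treating this as the complete arc list is an assumption.
ARCS = {("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
        ("C", "E"), ("D", "E"), ("E", "D")}

def is_directed_path(nodes):
    """Every step follows an arc in its own direction."""
    return all((u, v) in ARCS for u, v in zip(nodes, nodes[1:]))

def is_undirected_path(nodes):
    """Every step follows an arc in either direction."""
    return all((u, v) in ARCS or (v, u) in ARCS
               for u, v in zip(nodes, nodes[1:]))

def push_flow(flow, nodes, amount):
    """Add flow along an undirected path: arcs traversed forward gain flow,
    arcs traversed in the 'wrong direction' lose previously assigned flow."""
    new_flow = dict(flow)
    for u, v in zip(nodes, nodes[1:]):
        if (u, v) in ARCS:
            new_flow[(u, v)] = new_flow.get((u, v), 0) + amount
        else:
            new_flow[(v, u)] = new_flow.get((v, u), 0) - amount
            if new_flow[(v, u)] < 0:
                raise ValueError("not enough flow to cancel on arc %s%s" % (v, u))
    return new_flow

print(is_directed_path("ABCE"))      # True
print(is_directed_path("BCAD"))      # False: arc AC points away from D
print(is_undirected_path("BCAD"))    # True
print(push_flow({("A", "C"): 2}, "BCAD", 1))
```

With 2 units previously assigned to arc AC, pushing 1 unit along B → C → A → D leaves 1 unit on AC and puts 1 unit on each of BC and AD, as described above.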

A path that begins and ends at the same node is called a cycle. In a directed
network, a cycle is either a directed or an undirected cycle, depending on whether
the path involved is a directed or an undirected path. (Since a directed path also is
an undirected path, a directed cycle is an undirected cycle, but not vice versa in
general.) In Fig. 10.2, for example, DE–ED is a directed cycle. By contrast,
AB–BC–AC is not a directed cycle, because the direction of arc AC opposes the
direction of arcs AB and BC. On the other hand, AB–BC–AC is an undirected cycle,
because A → B → C → A is an undirected path. In the undirected network shown
in Fig. 10.1, there are many cycles, e.g., OA–AB–BC–CO. However, note that the
definition of path (a sequence of distinct arcs) rules out retracing one's steps in forming
a cycle. For example, OB–BO in Fig. 10.1 does not qualify as a cycle, because OB
and BO are two labels for the same arc (link). On the other hand, DE–ED is a (directed)
cycle in Fig. 10.2, because DE and ED are distinct arcs.

Two nodes are said to be connected if the network contains at least one
undirected path between them. (Note that the path does not need to be directed even if
the network is directed.) A connected network is a network where every pair of 
nodes is connected. Thus the networks in Figs. 10.1 and 10.2 are both connected. 
However, the latter network would not be connected if arcs AD and CE were removed. 
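
Connectedness can be checked mechanically by ignoring arc directions and searching outward from any one node. The sketch below is illustrative only: the function name `is_connected` is hypothetical, and the arc list is the set of arcs of Fig. 10.2 named in the text, assumed to be complete.

```python
from collections import deque

# Arcs of Fig. 10.2 as named in the text (assumed to be the complete list).
ARCS = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
        ("C", "E"), ("D", "E"), ("E", "D")]

def is_connected(nodes, arcs):
    """True if every pair of nodes is joined by some undirected path."""
    nodes = list(nodes)
    neighbors = {n: set() for n in nodes}
    for u, v in arcs:                 # direction is ignored
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:                      # breadth-first search from the first node
        u = queue.popleft()
        for w in neighbors[u] - seen:
            seen.add(w)
            queue.append(w)
    return len(seen) == len(nodes)

print(is_connected("ABCDE", ARCS))    # True
without = [arc for arc in ARCS if arc not in {("A", "D"), ("C", "E")}]
print(is_connected("ABCDE", without)) # False: D and E are cut off from A, B, C
```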

Consider a set of n nodes (e.g., the n = 5 nodes in Fig. 10.2) without any 
arcs. A “‘tree’’ can then be “‘grown’’ by adding one arc (or branch) at a time in a 
certain way. The first arc can go anywhere to connect some pair of nodes. Thereafter, 
each new arc should be between a node that already is connected to other nodes and 
a new node not previously connected to any other nodes. Adding an arc in this way 
avoids creating a cycle, and also ensures that the number of connected nodes is one 
greater than the number of arcs. Each new arc creates a larger tree, which is a 
connected network (for some subset of the n nodes) that contains no undirected cycles. 
Once the (n - 1)st arc has been added, the process stops because the resulting tree 
spans (connects) all n nodes. This tree is called a spanning tree, i.e., a connected 
network for all n nodes that contains no undirected cycles. Every spanning tree has 
exactly (n - 1) arcs, since this is the minimum number of arcs needed to have a 
connected network and the maximum number possible without having undirected 
cycles. 
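
The tree-growing process just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name grow_spanning_tree is hypothetical, the arcs are those named in the text for Fig. 10.2, and the arcs are simply tried in the order listed, so the resulting tree need not be the one drawn in the book.

```python
# Arcs named in the text for Fig. 10.2 (directions are ignored here).
ARCS = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
        ("C", "E"), ("D", "E"), ("E", "D")]
NODES = ["A", "B", "C", "D", "E"]


def grow_spanning_tree(nodes, arcs):
    """Grow a tree one arc at a time; return the arcs of a spanning tree."""
    tree = []
    connected = set()
    while len(tree) < len(nodes) - 1:
        for i, j in arcs:
            if not tree:
                # The first arc can go anywhere.
                usable = True
            else:
                # Later arcs must join a connected node to a new node,
                # which is what prevents a cycle from forming.
                usable = (i in connected) != (j in connected)
            if usable:
                tree.append((i, j))
                connected.update((i, j))
                break
        else:
            raise ValueError("network is not connected; no spanning tree")
    return tree


tree = grow_spanning_tree(NODES, ARCS)
print(tree)       # a spanning tree with n - 1 = 4 arcs
print(len(tree))  # 4
```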

Figure 10.3 uses the five nodes and some of the arcs of Fig. 10.2 to illustrate 
this process of growing a tree one arc (branch) at a time until a spanning tree has 
been obtained. There are several alternative choices for the new arc at each stage of 
the process, so Fig. 10.3 shows only one of many ways to construct a spanning tree 
in this case. Note, however, how each new added arc satisfies the conditions specified 
in the preceding paragraph. We shall discuss and illustrate spanning trees further in 
Sec. 10.4. 

Figure 10.3 Example of growing a tree one arc at a time for the network of Fig. 10.2. (a) The nodes 
without arcs; (b) a tree with one arc; (c) a tree with two arcs; (d) a tree with three arcs; (e) a spanning 
tree. 

Spanning trees play a key role in the analysis of many networks. For example, 
they form the basis for the minimum spanning tree problem discussed in Sec. 10.4. 
Another prime example is that (feasible) spanning trees correspond to the basic feasible 
solutions for the network simplex method discussed in Sec. 10.6. 

Finally, we shall need a little additional terminology about flows in networks. 
The maximum amount of flow (possibly infinity) that can be carried on a directed arc 
is referred to as the arc capacity. For nodes, a distinction is made among those that 
are net generators of flow, net absorbers of flow, or neither. A supply node (or source 
node or source) has the property that the flow out of the node exceeds the flow into 
the node. The reverse case is a demand node (or sink node or sink), where the flow 
into the node exceeds the flow out of the node. A transshipment node (or intermediate 
node) satisfies conservation of flow, so flow in equals flow out. 


10.3 The Shortest Path Problem 


Although several other versions of the shortest path problem (including some for 
directed networks) are mentioned at the end of the section, we shall focus on the 
following simple version. Consider an undirected and connected network with two 
special nodes called the origin and the destination. Associated with each of the links 
(undirected arcs) is a nonnegative distance. The objective is to find the shortest path 
(the path with the minimum total distance) from the origin to the destination. 

A relatively straightforward algorithm is available for this problem. The essence 
of this procedure is that it fans out from the origin, successively identifying the shortest 
path to each of the nodes of the network in the ascending order of their (shortest) 
distances from the origin, thereby solving the problem when the destination node is 
reached. We shall first outline the method and then illustrate it by solving the shortest 
path problem encountered by the Seervada Park management in Sec. 10.1. 


Algorithm for Shortest Path Problem 


Objective of nth iteration: Find nth nearest node to origin. (To be repeated for 
n = 1, 2, ..., until nth nearest node is the destination.) 


Input for nth iteration: (n - 1) nearest nodes to origin (solved for at previous 
iterations), including their shortest path and distance from the origin. (These 
nodes, plus the origin, will be called solved nodes; the others are unsolved 
nodes.) 


Candidates for nth nearest node: Each solved node that is directly connected 
by a link to one or more unsolved nodes provides one candidate: the unsolved 
node with the shortest connecting link. (Ties provide additional candidates.) 


Calculation of nth nearest node: For each such solved node and its candidate, 
add the distance between them and the distance of the shortest path from the 
origin to this solved node. The candidate with the smallest such total distance 
is the nth nearest node (ties provide additional solved nodes), and its shortest 
path is the one generating this distance. 


Table 10.2 Applying Algorithm for Shortest Path Problem to Seervada Park Problem 


EXAMPLE: The Seervada Park management needs to find the shortest path from the 
park entrance (node O) to the scenic wonder (node T) through the road system shown 
in Fig. 10.1. Applying the preceding algorithm to this problem yields the results 
shown in Table 10.2 (where the tie for the second nearest node allows skipping directly 
to seeking the fourth nearest node next). The first column (n) indicates the iteration 
count. The second column simply lists the solved nodes for beginning the current 
iteration after deleting the irrelevant ones (those not connected directly to any unsolved 
node). The third column then gives the candidates for the nth nearest node (the 
unsolved nodes with the shortest connecting link to a solved node). The fourth column 
calculates the distance of the shortest path from the origin to each of these candidates 
(namely, the distance to the solved node plus the link distance to the candidate). The 
candidate with the smallest such distance is the nth nearest node to the origin, as listed 
in the fifth column. The last two columns summarize the information for this newest 
solved node that is needed to proceed to subsequent iterations (namely, the distance 
of the shortest path from the origin to this node and the last link on this shortest path). 

The shortest path from the destination to the origin can now be traced back 
through the last column of Table 10.2 as either T → D → E → B → A → O or 
T → D → B → A → O. Therefore, the two alternates for the shortest path from the 
origin to the destination have been identified as O → A → B → E → D → T and 
O → A → B → D → T, with a total distance of 13 miles on either path. 
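
For readers who want to check the example by computer, the following Python sketch mirrors the algorithm's logic. It is an illustration only. The link distances are assumed values for Fig. 10.1 (the figure is not reproduced in this section), chosen to agree with the 13-mile result above, and the function name shortest_path is our own. Rather than listing one candidate per solved node, the sketch examines every link from a solved node to an unsolved node, which selects the same nth nearest node. Ties are broken arbitrarily, so only one of the two alternate paths is returned.

```python
# Link distances (miles) assumed for Fig. 10.1; they agree with the
# 13-mile shortest path reported in the text.
LINKS = {
    ("O", "A"): 2, ("O", "B"): 5, ("O", "C"): 4, ("A", "B"): 2,
    ("A", "D"): 7, ("B", "C"): 1, ("B", "D"): 4, ("B", "E"): 3,
    ("C", "E"): 4, ("D", "E"): 1, ("D", "T"): 5, ("E", "T"): 7,
}


def shortest_path(links, origin, destination):
    """Solve nodes in ascending order of their distance from the origin."""
    # Undirected network: each link can be traversed in either direction.
    adjacent = {}
    for (i, j), length in links.items():
        adjacent.setdefault(i, {})[j] = length
        adjacent.setdefault(j, {})[i] = length

    distance = {origin: 0}  # shortest distance to each solved node
    last_link = {}          # solved node -> preceding node on its path
    while destination not in distance:
        # Candidates: unsolved nodes linked directly to a solved node.
        candidates = [
            (distance[s] + length, u, s)
            for s in distance
            for u, length in adjacent[s].items()
            if u not in distance
        ]
        if not candidates:
            raise ValueError("destination cannot be reached from origin")
        total, nearest, via = min(candidates)
        distance[nearest] = total
        last_link[nearest] = via

    # Trace the path back from the destination through the last links.
    path = [destination]
    while path[-1] != origin:
        path.append(last_link[path[-1]])
    return distance[destination], path[::-1]


miles, path = shortest_path(LINKS, "O", "T")
print(miles, " -> ".join(path))  # 13 O -> A -> B -> D -> T
```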





Other Applications 


Before concluding this discussion of the shortest path problem, we need to emphasize 
one point. The problem thus far has been described in terms of minimizing the distance 
from the origin to the destination. However, in actuality the network problem being 
solved is finding which path connecting two specified nodes minimizes the sum of 
the link values on the path. There is no reason that these link values need to represent 
distances, even indirectly. For example, the links might correspond to activities of 
some kind, where the value associated with each link is the cost of that activity. The 
problem then would be to find which sequence of activities that accomplishes a specified 
objective minimizes the total cost involved. (See Prob. 2.) Another alternative is 
that the value associated with each link is the time required for that activity. The 
problem then would be to find which sequence of activities that accomplishes a specified 
objective minimizes the total time involved. (See Prob. 6.) Thus some of the most 
important applications of the shortest path problem have nothing to do with distances. 

Many of these applications require finding the shortest directed path from the 
origin to the destination through a directed network. The algorithm already presented 
can be easily modified to deal just with directed paths at each iteration. In particular, 
when identifying candidates for the nth nearest node, only directed arcs from a solved 
node to an unsolved node would be considered. 

Another version of the shortest path problem is to find the shortest paths from 
the origin to all of the other nodes of the network. Notice that the algorithm already 
solves for the shortest path to each node that is closer to the origin than the destination. 
Therefore, when all nodes are potential destinations, the only modification needed in 
the algorithm is that it does not stop until all of the nodes are solved nodes. 
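
As a rough sketch of this all-destinations variant, the Python code below keeps solving nodes until none remain. It is not the hand procedure of the text: it uses a priority queue (the standard heapq module) to pick the next nearest node, which is a common way to organize the same idea. The link distances are again assumed values for Fig. 10.1, and the function name shortest_distances is hypothetical.

```python
import heapq

# Link distances (miles) assumed for Fig. 10.1 (figure not reproduced here).
LINKS = {
    ("O", "A"): 2, ("O", "B"): 5, ("O", "C"): 4, ("A", "B"): 2,
    ("A", "D"): 7, ("B", "C"): 1, ("B", "D"): 4, ("B", "E"): 3,
    ("C", "E"): 4, ("D", "E"): 1, ("D", "T"): 5, ("E", "T"): 7,
}


def shortest_distances(links, origin):
    """Return the shortest distance from the origin to every reachable node."""
    adjacent = {}
    for (i, j), length in links.items():
        adjacent.setdefault(i, {})[j] = length
        adjacent.setdefault(j, {})[i] = length

    distance = {}         # solved nodes and their shortest distances
    heap = [(0, origin)]  # tentative (distance, node) pairs
    while heap:
        total, node = heapq.heappop(heap)
        if node in distance:
            continue  # already solved at an equal or shorter distance
        distance[node] = total
        for other, length in adjacent[node].items():
            if other not in distance:
                heapq.heappush(heap, (total + length, other))
    return distance


result = shortest_distances(LINKS, "O")
print(result)
```

With the assumed distances, nodes B and C are both 4 miles from the origin, which is the tie for the second nearest node mentioned in the example.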

An even more general version of the shortest path problem is to find the shortest 
paths from every node to every other node. Another option is to drop the restriction 
that "distances" (arc values) be nonnegative. Constraints also can be imposed on the 
paths that can be followed. All of these variations occasionally arise in applications, 
and so have been studied by researchers. 

The algorithms for a wide variety of combinatorial optimization problems, such 
as certain vehicle routing or network design problems, often call for the solution of 
a large number of shortest path problems as subroutines. Although we lack the space 
to pursue this topic further, this use may now be the most important kind of application 
of the shortest path problem. 


10.4 The Minimum Spanning Tree Problem 


The minimum spanning tree problem bears some similarities to the main version of 
the shortest path problem presented in the preceding section. In both cases, an undirected 
network is being considered, where the given information includes the nodes 
and the distances¹ between pairs of nodes. However, the crucial difference for the 
minimum spanning tree problem is that the links (undirected arcs) between the nodes 
are no longer specified. Thus, rather than finding a shortest path through a fully defined 
network, the problem involves choosing for the network the links that have the shortest 
total length while providing a path between each pair of nodes. The links need to be 
chosen in such a way that the resulting network forms a tree (as defined in Sec. 10.2) 
that spans (connects) all the given nodes. In short, the problem is to find the spanning 
tree with a minimum total length of the links. 

Figure 10.4 illustrates this concept of a spanning tree for the Seervada Park 
problem (see Sec. 10.1). Thus Fig. 10.4a is not a spanning tree because the 
(O, A, B, C) nodes are not connected with the (D, E, T) nodes. It needs another link 
to make this connection. This network actually consists of two trees, one for each of 
these two sets of nodes. The links in Fig. 10.4b do span the network (i.e., the network 
is connected as defined in Sec. 10.2), but it is not a tree because there are two cycles 
(O-A-B-C-O and D-T-E-D). It has too many links. Because the Seervada Park 
problem has n = 7 nodes, Sec. 10.2 indicates that the network must have exactly 
(n - 1) = 6 links, with no cycles, to qualify as a spanning tree. This condition is 
achieved in Fig. 10.4c, so this network is a feasible solution (with a value of 24 miles 
for the total length of the links) for the minimum spanning tree problem. (You soon 
will see that this solution is not optimal because it is possible to construct a spanning 
tree with only 14 miles of links.) 

Figure 10.4 Illustrations of the spanning tree concept for the Seervada Park problem. (a) Not a 
spanning tree; (b) not a spanning tree; (c) a spanning tree. 

¹ Once again, "distance" instead can be cost, time, or some other quantity. 

This problem has a number of important practical applications. For example, it 
can sometimes be helpful in planning transportation networks that will not be used 
much, where the primary consideration is to provide some path between all pairs of 
nodes in the most economical way. (See Prob. 8.) The nodes would be the locations 
that require access to the other locations, the branches would be transportation lanes 
(highways, railroad tracks, air lanes, and so forth), and the "distances" (link values) 
would be the costs of providing the transportation lanes. In this context, the minimum 
spanning tree problem is to determine which transportation lanes would service all 
the locations with a minimum total cost. Other examples where a comparable decision 
arises include the planning of large-scale communication networks and distribution 
networks. Both represent important application areas. 

The minimum spanning tree problem can be solved in a very straightforward 
way because it happens to be one of the few operations research problems where 
being greedy at each stage of the solution procedure still leads to an overall optimal 
solution at the end! Thus, beginning with any node, the first stage involves choosing 
the shortest possible link to another node, without worrying about the effect this choice 
would have on subsequent decisions. The second stage involves identifying the un- 
connected node that is closest to either of these connected nodes and then adding the 
corresponding link to the network. This process would be repeated, as per the follow- 
ing summary, until all the nodes have been connected. (Note that this is the same 
process already illustrated in Fig. 10.3 for constructing a spanning tree, but now with 
a specific rule for selecting each new link.) The resulting network is guaranteed to be 
a minimum spanning tree. 


Algorithm for Minimum Spanning Tree Problem 


1. Select any node arbitrarily, and then connect it (i.e., add a link) to the nearest 
distinct node. 

2. Identify the unconnected node that is closest to a connected node, and then 
connect these two nodes (i.e., add a link between them). Repeat this step 
until all nodes have been connected. 


Tie Breaking: Ties for the nearest distinct node (step 1) or the closest un- 
connected node (step 2) may be broken arbitrarily and the algorithm must 
still yield an optimal solution. However, such ties are a signal that there may 
be (but need not be) multiple optimal solutions. All such optimal solutions 
can be identified by pursuing all ways of breaking ties to their conclusion. 


The fastest way of executing this algorithm manually is the graphical approach 
illustrated as follows. 


EXAMPLE: The Seervada Park management (see Sec. 10.1) needs to determine under 
which roads telephone lines should be installed to connect all stations with a minimum 
total length of line. Using the data given in Fig. 10.1, we outline the step-by-step 
solution of this problem next. 

Nodes and distances for the problem are summarized below, where the thin lines 
now represent potential links. 

Arbitrarily select node O to start. The unconnected node closest to node O is node A. 
Connect node A to node O. 

The unconnected node closest to either node O or A is node B (closest to A). Connect 
node B to node A. 





The unconnected node closest to node O, A, or B is node C (closest to B). Connect 
node C to node B. 





The unconnected node closest to node O, A, B, or C is node E (closest to B). Connect 
node E to node B. 

The unconnected node closest to node O, A, B, C, or E is node D (closest to E). 
Connect node D to node E. 








The only remaining unconnected node is node T. It is closest to node D. Connect 
node T to node D. 








All nodes are now connected, so this solution to the problem is the desired (optimal) 
one. The total length of the links is 14 miles. 

Although it may appear at first glance that the choice of the initial node will 
affect the resulting final solution (and its total link length) with this procedure, it really 
doesn’t. We suggest you verify this fact for the example by reapplying the algorithm, 
starting with nodes other than node O. 
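
This verification also can be done by computer. The following Python sketch applies the greedy algorithm from every possible starting node. It is only an illustration: the link distances are assumed values for Fig. 10.1 (not reproduced in this section) that agree with the 14-mile total above, and the function name minimum_spanning_tree is our own. Ties are broken arbitrarily by the ordering of the tuples.

```python
# Link distances (miles) assumed for Fig. 10.1; they agree with the
# 14-mile minimum spanning tree reported in the text.
LINKS = {
    ("O", "A"): 2, ("O", "B"): 5, ("O", "C"): 4, ("A", "B"): 2,
    ("A", "D"): 7, ("B", "C"): 1, ("B", "D"): 4, ("B", "E"): 3,
    ("C", "E"): 4, ("D", "E"): 1, ("D", "T"): 5, ("E", "T"): 7,
}


def minimum_spanning_tree(links, start):
    """Greedily connect the closest unconnected node until all are connected."""
    nodes = {node for link in links for node in link}
    connected = {start}
    chosen = []
    while connected != nodes:
        # Potential links joining a connected node to an unconnected node.
        frontier = [
            (length, i, j)
            for (i, j), length in links.items()
            if (i in connected) != (j in connected)
        ]
        if not frontier:
            raise ValueError("the nodes cannot all be connected")
        length, i, j = min(frontier)
        chosen.append((i, j, length))
        connected.update((i, j))
    return chosen


for start in "OABCDET":
    tree = minimum_spanning_tree(LINKS, start)
    print(start, sum(length for _, _, length in tree))  # 14 for every start
```

Starting from node O, the links are chosen in the same order as in the example: OA, AB, BC, BE, DE, and DT.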
The minimum spanning tree problem is the one problem we consider in this 
chapter that falls into the broad category of network design. In this category, the 
objective is to design the most appropriate network for the given application (fre- 
quently involving transportation systems) rather than analyzing an already designed 
network. Selected Reference 10 provides a survey of this important area. 


10.5 The Maximum Flow Problem 


Now recall that the third problem facing the Seervada Park management (see Sec. 
10.1) during the peak season is to determine how to route the various tram trips from 
the park entrance (station O in Fig. 10.1) to the scenic wonder (station T) to maximize 
the number of trips per day. (Each tram will return by the same route it took on the 
outgoing trip, so the analysis focuses on outgoing trips only.) In order to avoid unduly 
disturbing the ecology and wildlife of the region, strict upper limits have been imposed 
on the number of outgoing trips allowed per day in each direction on each individual 
road. These limits are shown in Fig. 10.5, where the numbers next to each station 
and road give the limit for that road in the direction leading away from that station. 
For example, only one loaded trip per day is allowed from station A to station B, but 
one other also is allowed from station B to station A. Given the limits, one feasible 
solution is to send seven trams per day, with five using the route O → B → E → T, 
one using O → B → C → E → T, and one using O → B → C → E → D → T. 
However, because this solution blocks the use of any routes starting with O → C 
(because the E → T and E → D capacities are fully used), it is easy to find better 
feasible solutions. Many combinations of routes (and the number of trips to assign to 
each one) need to be considered to find the one(s) maximizing the number of trips 
made per day. This kind of problem is called a maximum flow problem. 

Using the terminology introduced in Sec. 10.2, the maximum flow problem can 
be described formally as follows. Consider a directed and connected network where 
just one node is a supply node, one node is a demand node, and the rest are trans- 
shipment nodes. Given the arc capacities, the objective is to determine the feasible 
pattern of flows through the network that maximizes the total flow from the supply 
node to the demand node. 

To formally fit the Seervada Park problem into this format with a directed 
network, each link in Fig. 10.5 with a 0 at one end would be replaced by a directed 
arc in the direction of feasible flow. For example, the link between nodes O and A 
would be replaced by a directed arc from node O to node A with an arc capacity of 
5. The two links with a 1 at either end (AB and DE) would be replaced by a pair of 
directed arcs in opposite directions, each with an arc capacity of 1. With these 
understandings, we shall continue to operate on the network as shown in Fig. 10.5. 

Figure 10.5 Limits on the number of trips per day for the Seervada Park problem. 

Because the maximum flow problem can be formulated as a linear programming 
problem (see Prob. 11), it can be solved by the simplex method. However, an even 
more efficient augmenting path algorithm is available for solving this problem. This 
algorithm is based on two intuitive concepts, those of a residual network and of an 
augmenting path. 

At the outset, the residual network differs from the original network only in 
that each directed arc (i → j) lacking a directed arc in the opposite direction (j → i) 
now has such an arc added with zero arc capacity. Subsequently, the arc capacities 
in the residual network (called residual capacities) are adjusted as follows. Each time 
some amount of flow Δ is added to arc i → j in the original network, the residual 
capacity of arc i → j is decreased by Δ but the residual capacity of arc j → i is 
increased by Δ. Thus the residual capacity represents the unused arc capacity in the 
original network or the amount of flow in the opposite direction in this network that 
can be cancelled (or a combination of both if the original network has arcs in both 
directions). Therefore, after assigning various flows to the original network, the residual 
network shows how much more can be done either by increasing flows further 
or by cancelling previously assigned flows. 

An augmenting path is a directed path from the supply node to the demand 
node in the residual network such that every arc on this path has strictly positive 
residual capacity. The minimum of these residual capacities is called the residual 
capacity of the augmenting path because it represents the amount of flow that can 
feasibly be added to the entire path. Therefore, each augmenting path provides an 
opportunity to further augment the flow through the original network. 

The augmenting path algorithm repeatedly selects some augmenting path and 
adds a flow equal to its residual capacity to that path in the original network. This 
process continues until there are no more augmenting paths, so the flow from the 
supply node to the demand node cannot be increased further. The key to ensuring that 
the final solution necessarily is optimal is the fact that augmenting paths can cancel 
some previously assigned flows in the original network, so an indiscriminate selection 
of paths for assigning flows cannot prevent the use of a better combination of flow 
assignments. 

To summarize, each iteration of the algorithm consists of the following three 
steps. 


Algorithm for Maximum Flow Problem¹ 


1. Identify an augmenting path by finding some directed path from the supply 
node to the demand node in the residual network such that every arc on this 
path has strictly positive residual capacity. (If no augmenting path exists, 
the net flows already assigned constitute an optimal flow pattern.) 

2. Identify the residual capacity c* of this augmenting path by finding the 
minimum of the residual capacities of the arcs on this path. Increase the flow 
in this path by c*. 

3. Decrease by c* the residual capacity of each arc on this augmenting path. 
Increase by c* the residual capacity of each arc in the opposite direction on 
this augmenting path. Return to step 1. 


¹ It is assumed that the arc capacities are either integers or rational numbers. 


When performing step 1, there often will be a number of alternative augmenting 
paths from which to choose. Although the algorithmic strategy for making this selec- 
tion is of some importance for the efficiency of large-scale implementations, we shall 
not delve into this relatively specialized topic. (Later in the section, we do describe 
a systematic procedure for finding some augmenting path.) Therefore, for the follow- 
ing example (and the problems at the end of the chapter), the selection is just made 
arbitrarily. 


EXAMPLE: Applying this algorithm to the Seervada Park problem (see Fig. 10.5 
for the original network) yields the results summarized next. For each iteration, the 
residual network is shown after completing all three steps, where a single line is used 
to represent the pair of directed arcs in opposite directions between each pair of nodes. 
The residual capacity of arc i —> j is shown next to node i, whereas the residual 
capacity of arc j —> i is shown next to node j. Using this format, the network shown 
in Fig. 10.5 actually is the residual network at the outset, before assigning any flows. 
After subsequent iterations, we show in boldface (next to nodes O and T) the total 
amount of flow achieved thus far. 


Iteration 1: Referring to Fig. 10.5, one of several augmenting paths is O → B → 
E → T, which has a residual capacity of min{7, 5, 6} = 5. Assigning a flow of 5 to 
this path, the resulting residual network is 





Iteration 2: Assign a flow of 3 to the augmenting path, O → A → D → T. The 
resulting residual network is 





Iteration 3: Assign a flow of 1 to the augmenting path, O → A → B → D → T. 

Iteration 4: Assign a flow of 2 to the augmenting path, O → B → D → T. The 
resulting residual network is 





Iteration 5: Assign a flow of 1 to the augmenting path, O → C → E → D → T. 


Iteration 6: Assign a flow of 1 to the augmenting path, O → C → E → T. The 
resulting residual network is 





Iteration 7: Assign a flow of 1 to the augmenting path, O → C → E → B → D → 
T. The resulting residual network is 





There are no more augmenting paths, so the current flow pattern is optimal. 


The current flow pattern may be identified by either cumulating the flow assignments 
or by comparing the final residual capacities with the original arc capacities. 
If we use the latter method, there is flow along an arc if the final residual capacity is 
less than the original capacity. The magnitude of this flow equals the difference in 
these capacities. Applying this method by comparing the residual network obtained 
from the last iteration with Fig. 10.5 yields the optimal flow pattern shown in Fig. 
10.6. 

Figure 10.6 Optimal solution for the Seervada Park maximum flow problem. 
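
The bookkeeping in this example can be replayed with a short Python sketch. It is an illustration under stated assumptions: the arc capacities are assumed values for Fig. 10.5 (the figure is not reproduced here), chosen to agree with the capacities and flow amounts quoted in the text, and the function name augment is hypothetical. The seven augmenting paths are taken from the iterations above. Steps 2 and 3 of the algorithm are applied to each path, and the flow pattern is then recovered by comparing final residual capacities with the original capacities.

```python
# Arc capacities assumed for Fig. 10.5; they agree with the values
# quoted in the text (e.g., min{7, 5, 6} = 5 for O -> B -> E -> T).
CAPACITY = {
    ("O", "A"): 5, ("O", "B"): 7, ("O", "C"): 4,
    ("A", "B"): 1, ("B", "A"): 1, ("A", "D"): 3,
    ("B", "C"): 2, ("B", "D"): 4, ("B", "E"): 5,
    ("C", "E"): 4, ("D", "E"): 1, ("E", "D"): 1,
    ("D", "T"): 9, ("E", "T"): 6,
}

# Augmenting paths selected in iterations 1 through 7 of the example.
PATHS = ["OBET", "OADT", "OABDT", "OBDT", "OCEDT", "OCET", "OCEBDT"]


def augment(residual, path):
    """Apply steps 2 and 3 to one augmenting path and return c*."""
    arcs = list(zip(path, path[1:]))
    # Step 2: c* is the smallest residual capacity along the path.
    c_star = min(residual.get(arc, 0) for arc in arcs)
    # Step 3: adjust residual capacities in both directions.
    for i, j in arcs:
        residual[(i, j)] = residual.get((i, j), 0) - c_star
        residual[(j, i)] = residual.get((j, i), 0) + c_star
    return c_star


residual = dict(CAPACITY)
amounts = [augment(residual, path) for path in PATHS]
print(amounts, sum(amounts))  # [5, 3, 1, 2, 1, 1, 1] 14

# There is flow on an arc if its final residual capacity is less than
# its original capacity; the flow equals the difference.
flow = {arc: cap - residual[arc] for arc, cap in CAPACITY.items()
        if cap - residual[arc] > 0}
print(flow)
```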

This example nicely illustrates the reason for supplementing an arc i → j in the 
original network by an arc j → i in the residual network and then increasing the 
residual capacity of the latter arc by c* when a flow of c* is assigned to arc i → j. 
Without this refinement, the first six iterations would be unchanged. However, at that 
point it would appear that no augmenting paths remain (because the real unused arc 
capacity for E → B is zero). Therefore, the refinement permits adding the flow assignment 
of 1 for O → C → E → B → D → T in iteration 7. In effect, this additional 
flow assignment cancels out one unit of flow assigned at iteration 1 (O → B → E → 
T) and replaces it by assignments of one unit of flow to both O → B → D → T and 
O → C → E → T. 

The most difficult part of this algorithm when large networks are involved is 
finding an augmenting path. This task may be simplified by the following systematic 
procedure. Begin by determining all nodes that can be reached from the supply node 
along a (single) arc with strictly positive residual capacity. Then, for each of these 
nodes that were reached, determine all new nodes (those not yet reached) that can be 
reached from this node along an arc with strictly positive residual capacity. Repeat 
this successively with the new nodes as they are reached. The result will be the 
identification of a tree of all the nodes that can be reached from the supply node along 
a path with strictly positive residual flow capacity. Hence this fanning-out procedure 
will always identify an augmenting path if one exists. The procedure is illustrated 
in Fig. 10.7 for the residual network that results from iteration 6 in the preceding 
example. 

Figure 10.7 Procedure for finding an augmenting path for iteration 7 of the Seervada Park example. 

Figure 10.8 A minimum cut for the Seervada Park problem. 
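
The fanning-out procedure lends itself to a compact implementation. The Python sketch below combines it with the three steps of the algorithm. It is a minimal illustration, not a tuned implementation: the function names find_augmenting_path and maximum_flow are our own, the capacities are assumed values for Fig. 10.5, and the paths it selects generally differ from the arbitrary choices made in the example (the maximum flow is the same).

```python
from collections import deque

# Arc capacities assumed for Fig. 10.5 (figure not reproduced here).
CAPACITY = {
    ("O", "A"): 5, ("O", "B"): 7, ("O", "C"): 4,
    ("A", "B"): 1, ("B", "A"): 1, ("A", "D"): 3,
    ("B", "C"): 2, ("B", "D"): 4, ("B", "E"): 5,
    ("C", "E"): 4, ("D", "E"): 1, ("E", "D"): 1,
    ("D", "T"): 9, ("E", "T"): 6,
}


def find_augmenting_path(residual, source, sink):
    """Fan out from the source along arcs with positive residual capacity."""
    reached_from = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for (i, j), cap in residual.items():
            if i == node and cap > 0 and j not in reached_from:
                reached_from[j] = node
                queue.append(j)
    if sink not in reached_from:
        return None  # no augmenting path exists
    path = [sink]
    while path[-1] != source:
        path.append(reached_from[path[-1]])
    return path[::-1]


def maximum_flow(capacity, source, sink):
    """Repeat steps 1 to 3 until no augmenting path remains."""
    residual = dict(capacity)
    for i, j in capacity:
        residual.setdefault((j, i), 0)  # add reverse arcs with zero capacity
    total = 0
    while True:
        path = find_augmenting_path(residual, source, sink)
        if path is None:
            return total
        arcs = list(zip(path, path[1:]))
        c_star = min(residual[arc] for arc in arcs)
        for i, j in arcs:
            residual[(i, j)] -= c_star
            residual[(j, i)] += c_star
        total += c_star


print(maximum_flow(CAPACITY, "O", "T"))  # 14
```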

Although the procedure illustrated in Fig. 10.7 is a relatively straightforward 
one, it would be helpful to be able to recognize when optimality has been reached 
without an exhaustive search for a nonexistent path. It is sometimes possible to recognize 
this event because of an important theorem of network theory known as the 
max-flow min-cut theorem. A cut may be defined as any set of directed arcs containing 
at least one arc from every directed path from the supply node to the demand node. 
The cut value is the sum of the arc capacities of the arcs (in the specified direction) 
of the cut. The max-flow min-cut theorem states that, for any network with a single 
supply node and demand node, the maximum feasible flow from the supply node to 
the demand node equals the minimum cut value for all of the cuts of the network. 
Thus, if we let F denote the amount of flow from the supply node to the demand 
node for any feasible flow pattern, the value of any cut provides an upper bound to 
F, and the smallest of the cut values is equal to the maximum value of F. Therefore, 
if a cut whose value equals the value of F currently attained by the solution procedure 
can be found in the original network, the current flow pattern must be optimal. Even- 
tually, optimality has been attained whenever there exists a cut in the residual network 
whose value is zero. 

To illustrate, consider the cut in the network of Fig. 10.5 that is indicated in 
Fig. 10.8. Notice that the value of the cut is (3 + 4 + 1 + 6) = 14, which was 
found to be the maximum value of F, so this cut is a minimum cut. Notice also that, 
in the residual network resulting from iteration 7, where F = 14, the corresponding 
cut has a value of zero. If this had been noticed, it would not have been necessary to 
search for additional augmenting paths. 
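
The cut value calculation can be sketched in Python as follows. This is only an illustration. The arc capacities are assumed values for Fig. 10.5, and a cut is described here by the set of nodes on the supply side, which is an assumption about Fig. 10.8 that reproduces the four arc capacities (3 + 4 + 1 + 6) listed above. The function name cut_value is hypothetical.

```python
# Arc capacities assumed for Fig. 10.5 (figure not reproduced here).
CAPACITY = {
    ("O", "A"): 5, ("O", "B"): 7, ("O", "C"): 4,
    ("A", "B"): 1, ("B", "A"): 1, ("A", "D"): 3,
    ("B", "C"): 2, ("B", "D"): 4, ("B", "E"): 5,
    ("C", "E"): 4, ("D", "E"): 1, ("E", "D"): 1,
    ("D", "T"): 9, ("E", "T"): 6,
}


def cut_value(capacity, supply_side):
    """Sum the capacities of arcs leading out of the supply side of a cut."""
    return sum(cap for (i, j), cap in capacity.items()
               if i in supply_side and j not in supply_side)


# Assumed cut of Fig. 10.8: arcs A -> D, B -> D, E -> D, and E -> T.
print(cut_value(CAPACITY, {"O", "A", "B", "C", "E"}))  # 14

# Any other cut gives an upper bound on F that is at least as large.
print(cut_value(CAPACITY, {"O"}))                           # 16
print(cut_value(CAPACITY, {"O", "A", "B", "C", "D", "E"}))  # 15
```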


10.6 The Minimum Cost Flow Problem 


The minimum cost flow problem holds a central position among network optimization 
models, both because it encompasses such a broad class of applications and because 
it can be solved extremely efficiently. Like the maximum flow problem, it considers 
flow through a network with limited arc capacities. Like the shortest path problem, 
it considers a cost (or distance) for flow through an arc. Like the transportation 
problem or assignment problem of Chap. 7, it can consider multiple sources (supply 
nodes) and multiple destinations (demand nodes) for the flow, again with associated 
costs. Like the transshipment problem of Chap. 7, it also can consider various junction 
points (transshipment nodes) between the sources and destinations for this flow. In 
fact, all five of these previously studied problems are special cases of the minimum 
cost flow problem, as we will demonstrate shortly. 

The reason that the minimum cost flow problem can be solved so efficiently is 
that it can be formulated as a linear programming problem, so it can be solved by a 
streamlined version of the simplex method called the network simplex method. We 
describe this algorithm in the next section. 


Formulation 


Consider a directed and connected network, where the n nodes include at least one 
supply node and at least one demand node. The decision variables are 


    $x_{ij}$ = flow through arc i → j, 


and the given information includes 


    $c_{ij}$ = cost per unit flow through arc i → j, 
    $u_{ij}$ = arc capacity for arc i → j, 
    $b_i$ = net flow generated at node i. 

The value of $b_i$ depends on the nature of node i, where 

    $b_i > 0$, if node i is a supply node, 
    $b_i < 0$, if node i is a demand node, 
    $b_i = 0$, if node i is a transshipment node. 


The objective is to minimize the total cost of sending the available supply through 
the network to satisfy the given demand. 

By using the convention that summations are taken only over existing arcs, the 
linear programming formulation of this problem is 

$$\text{Minimize} \quad Z = \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_{ij},$$

subject to 

$$\sum_{j=1}^{n} x_{ij} - \sum_{j=1}^{n} x_{ji} = b_i, \quad \text{for each node } i,$$

and 

$$0 \le x_{ij} \le u_{ij}, \quad \text{for each arc } i \to j.$$


The first summation in the node constraints represents the total flow out of node i, 
whereas the second summation represents the total flow into node i, so the difference 
is the net flow generated at this node. 

In some applications, it is necessary to have a lower bound $L_{ij} > 0$ for the flow 
through each arc i → j. When this occurs, use a translation of variables, 
$x'_{ij} = x_{ij} - L_{ij}$, with $(x'_{ij} + L_{ij})$ substituted for $x_{ij}$ throughout the model, in order to convert the 
model back into the above format with nonnegativity constraints. 

It is not guaranteed that the problem actually will possess feasible solutions, 
depending partially upon which arcs are present in the network and their arc capacities. 
However, for a reasonably designed network, the main condition needed is the following. 


Feasible solutions property: A necessary condition for a minimum cost flow 
problem to have any feasible solutions is that 

$$\sum_{i=1}^{n} b_i = 0,$$

i.e., the total flow being generated at the supply nodes equals the total flow 
being absorbed at the demand nodes. 


If the values of the $b_i$ provided for some application violate this condition, the usual
interpretation is that either the supplies or the demands (whichever is in excess)
actually represent upper bounds rather than exact amounts. When this situation arose
for the transportation problem in Sec. 7.1, either a dummy destination was added to
receive the excess supply or a dummy source was added to send the excess demand.
The analogous step now is that either a dummy demand node should be added to
absorb the excess supply (with $c_{ij} = 0$ arcs added from every supply node to this
node) or a dummy supply node should be added to generate the flow for the excess
demand (with $c_{ij} = 0$ arcs added from this node to every demand node).
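
The balancing step can be sketched as follows. The function name, the node label "dummy", and the small data set are hypothetical; arcs are stored simply as a map from (i, j) to unit cost.

```python
def balance_with_dummy(b, arcs, dummy="dummy"):
    """b: {node: net flow generated}; arcs: {(i, j): unit cost}. Returns balanced copies."""
    b, arcs = dict(b), dict(arcs)
    excess = sum(b.values())  # > 0 means excess supply, < 0 means excess demand
    if excess > 0:
        for node, b_i in b.items():
            if b_i > 0:
                arcs[(node, dummy)] = 0   # zero-cost arc from every supply node
        b[dummy] = -excess                # dummy demand node absorbs the excess supply
    elif excess < 0:
        for node, b_i in b.items():
            if b_i < 0:
                arcs[(dummy, node)] = 0   # zero-cost arc to every demand node
        b[dummy] = -excess                # dummy supply node generates the excess demand
    return b, arcs

# Hypothetical data: total supply 50, total demand 40, so 10 units are in excess.
balanced_b, balanced_arcs = balance_with_dummy(
    {"S1": 30, "S2": 20, "D1": -40}, {("S1", "D1"): 4, ("S2", "D1"): 6}
)
print(balanced_b)
print(balanced_arcs)
```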

For many applications, the $b_i$ and $u_{ij}$ quantities will have integer values, and
implementation will require that the flow quantities (the $x_{ij}$) also be integer.
Fortunately, just as for the transportation problem, this outcome is guaranteed without
explicitly imposing integer constraints on the variables because of the following
property.


Integer solutions property: For minimum cost flow problems where every $b_i$
and $u_{ij}$ has an integer value, all the basic variables in every basic feasible solution
(including an optimal one) also have integer values.


An example of a minimum cost flow problem is shown in Fig. 10.9. This
network is the same as in Fig. 10.2 except that now values of the $b_i$, $c_{ij}$, and $u_{ij}$ have
been added. The $b_i$ values are shown in square brackets by the nodes, so the supply
nodes ($b_i > 0$) are A and B, the demand nodes ($b_i < 0$) are D and E, and the one
transshipment node ($b_i = 0$) is C. The $c_{ij}$ values are shown next to the arcs. In this
example, all but two of the arcs have arc capacities exceeding the total flow generated
(90), so $u_{ij} = \infty$ for all practical purposes. The two exceptions are arc A → B, where
$u_{AB} = 10$, and arc C → E, which has $u_{CE} = 80$.


Figure 10.9 Example of a minimum cost flow problem.

The linear programming model for this example is 


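$$\text{Minimize } Z = 2x_{AB} + 4x_{AC} + 9x_{AD} + 3x_{BC} + x_{CE} + 3x_{DE} + 2x_{ED},$$

subject to

$$\begin{aligned}
x_{AB} + x_{AC} + x_{AD} &= 50 \\
-x_{AB} + x_{BC} &= 40 \\
-x_{AC} - x_{BC} + x_{CE} &= 0 \\
-x_{AD} + x_{DE} - x_{ED} &= -30 \\
-x_{CE} - x_{DE} + x_{ED} &= -60
\end{aligned}$$

and

$$x_{AB} \le 10, \quad x_{CE} \le 80, \quad \text{and all } x_{ij} \ge 0.$$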


Now note the pattern of coefficients for each variable in the set of five node constraints.
Each variable has exactly two nonzero coefficients, where one is +1 and the other is
-1. This pattern recurs in every minimum cost flow problem, and it is this special
structure that leads to the integer solutions property.

Another implication of this special structure is that (any) one of the node
constraints is redundant. The reason is that summing all these constraint equations yields
nothing but zeroes on both sides (assuming feasible solutions exist, so the $b_i$ sum to
zero), so the negative of any one of these equations equals the sum of the rest of the
equations. With just $(n - 1)$ nonredundant node constraints, these equations provide
just $(n - 1)$ basic variables for a basic feasible solution. In the next section, you will
see that the network simplex method treats the $x_{ij} \le u_{ij}$ constraints as mirror images
of the nonnegativity constraints, so the total number of basic variables is $(n - 1)$.
This leads to a direct correspondence between the $(n - 1)$ arcs of a spanning tree
and the $(n - 1)$ basic variables, but more about that story later.
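
The coefficient pattern is easy to confirm for the example. The sketch below builds the matrix of node-constraint coefficients (one row per node, one column per arc) from the arcs of Fig. 10.9; the helper name and data layout are illustrative choices only.

```python
NODES = ["A", "B", "C", "D", "E"]
ARCS = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "E"), ("D", "E"), ("E", "D")]
B = {"A": 50, "B": 40, "C": 0, "D": -30, "E": -60}

def incidence_matrix(nodes, arcs):
    """Row per node, column per arc: +1 where the arc leaves the node, -1 where it enters."""
    rows = {n: [0] * len(arcs) for n in nodes}
    for k, (i, j) in enumerate(arcs):
        rows[i][k] = 1    # x_ij appears in the "flow out" sum of node i
        rows[j][k] = -1   # and in the "flow in" sum of node j
    return rows

matrix = incidence_matrix(NODES, ARCS)

# Adding all node constraints: every column sums to zero, and so do the b_i,
# which is why any one of the constraints is redundant.
column_sums = [sum(matrix[n][k] for n in NODES) for k in range(len(ARCS))]
print(column_sums, sum(B.values()))
```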

We shall soon solve this example by the network simplex method. However, 
let us first see how the five special cases mentioned earlier fit into the network format 
of the minimum cost flow problem. For each case, we shall show how to formulate
its prototype example in this more general way. 


Special Cases 


THE TRANSPORTATION PROBLEM: To formulate the transportation problem
presented in Sec. 7.1 as a minimum cost flow problem, a supply node is provided for
each source, as well as a demand node for each destination, but no transshipment
nodes are included in the network. All of the arcs are directed from a supply node to
a demand node, where distributing $x_{ij}$ units from source i to destination j corresponds
to a flow of $x_{ij}$ through arc i → j. The cost $c_{ij}$ per unit distributed becomes the cost
$c_{ij}$ per unit of flow. Since the transportation problem does not impose upper bound
constraints on individual $x_{ij}$, all of the $u_{ij} = \infty$.
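
As a sketch of this formulation, the function below turns transportation data into the $b_i$ values and arcs of a minimum cost flow network. The function name, the ("source", ...) and ("dest", ...) node labels, and the small data set are hypothetical; they are not the P & T Co. data.

```python
import math

def transportation_to_min_cost_flow(supply, demand, unit_cost):
    """supply: {source: amount}; demand: {destination: amount}; unit_cost: {(source, destination): c}."""
    # One supply node per source (b > 0) and one demand node per destination (b < 0).
    b = {("source", s): amount for s, amount in supply.items()}
    b.update({("dest", d): -amount for d, amount in demand.items()})
    # One uncapacitated arc from every supply node to every demand node.
    arcs = {
        (("source", s), ("dest", d)): {"cost": unit_cost[s, d], "capacity": math.inf}
        for s in supply
        for d in demand
    }
    return b, arcs

# Hypothetical data with two sources and two destinations.
b, arcs = transportation_to_min_cost_flow(
    supply={1: 20, 2: 30},
    demand={"x": 25, "y": 25},
    unit_cost={(1, "x"): 8, (1, "y"): 6, (2, "x"): 10, (2, "y"): 9},
)
print(b)
print(len(arcs))
```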

Using this formulation for the P & T Co. transportation problem presented in 
Table 7.2 yields the network shown in Fig. 10.10. 





Figure 10.10 Formulation of the P & T Co. transportation problem as a minimum cost flow problem. 


THE TRANSSHIPMENT PROBLEM: Recall that the transshipment problem presented 
in Sec. 7.3 is the generalization of the transportation problem where units being 
distributed from a source to a destination can first pass through intermediate points, 
which can be either transshipment points or other sources and destinations. Therefore, 
the formulation of the transshipment problem as a minimum cost flow problem is the 
same as for the transportation problem except that now a transshipment node is pro- 
vided for each transshipment point and arcs are added for each feasible intermediate 
trip from one point (source, transshipment point, or destination) to another. 

With these additions, this formulation actually includes all of the general features 
of the minimum cost flow problem except for not having (finite) arc capacities. For 
this reason, the minimum cost flow problem sometimes is called the capacitated 
transshipment problem. 

Using this formulation for the P & T Co. transshipment problem presented in 
Table 7.24 yields the network shown in Fig. 10.11. Because every arc has a companion 
arc in the opposite direction between the same pair of nodes, we have simplified the 
depiction of this large network by using a single link with arrowheads at both ends 
to represent the pair of arcs. We also have deleted the values of the $c_{ij}$, but each one
is shown in Table 7.24.


THE ASSIGNMENT PROBLEM: Since the assignment problem discussed in Sec. 7.4
is a special type of transportation problem, its formulation as a minimum cost flow
problem fits into the format already demonstrated in Fig. 10.10. The additional
specializations are that (1) the number of supply nodes equals the number of demand
nodes, (2) $b_i = 1$ for each supply node, and (3) $b_i = -1$ for each demand node.

Figure 10.12 shows this formulation for the Job Shop Co. assignment problem 
presented in Table 7.27. 


Figure 10.11 Formulation of the P & T Co. transshipment problem as a minimum cost flow problem.


THE SHORTEST PATH PROBLEM: Now consider the main version of the shortest
path problem presented in Sec. 10.3 (finding the shortest path from one origin to one
destination through an undirected network). To formulate this problem as a minimum
cost flow problem, one supply node with a supply of 1 is provided for the origin, one
demand node with a demand of 1 is provided for the destination, and the rest of the
nodes are transshipment nodes. Because the network for our shortest path problem is
undirected, whereas the minimum cost flow problem is assumed to have a directed
network, we replace each link by a pair of directed arcs in opposite directions (depicted
by a single line with arrowheads at both ends). The only exceptions are that there is
no need to bother with arcs into the supply node or out of the demand node. The
distance between nodes i and j becomes the unit cost $c_{ij}$ or $c_{ji}$ for flow in either
direction between these nodes. As with the preceding special cases, no arc capacities
are imposed, so all $u_{ij} = \infty$.
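
The conversion just described can be sketched as follows. The function name and the four-node network are hypothetical (this is not the Seervada Park data); undirected links are assumed to be stored once each as (i, j) with their distance.

```python
import math

def shortest_path_to_min_cost_flow(nodes, links, origin, destination):
    """links: {(i, j): distance} for the undirected links of the network."""
    b = {n: 0 for n in nodes}             # every other node is a transshipment node
    b[origin], b[destination] = 1, -1     # supply of 1 at the origin, demand of 1 at the destination
    arcs = {}
    for (i, j), distance in links.items():
        for tail, head in ((i, j), (j, i)):   # one arc in each direction
            if head == origin or tail == destination:
                continue                      # no arcs into the origin or out of the destination
            arcs[(tail, head)] = {"cost": distance, "capacity": math.inf}
    return b, arcs

# Hypothetical network with origin "O" and destination "T".
b, arcs = shortest_path_to_min_cost_flow(
    nodes=["O", "A", "B", "T"],
    links={("O", "A"): 2, ("A", "B"): 1, ("B", "T"): 5, ("O", "B"): 4},
    origin="O",
    destination="T",
)
print(b)
print(sorted(arcs))
```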

Figure 10.13 depicts this formulation for the Seervada Park shortest path prob- 
lem shown in Fig. 10.1, where the numbers next to the lines now represent the unit 
cost of flow in either direction. 


Figure 10.12 Formulation of the Job Shop Co. assignment problem as a minimum cost flow problem.


Figure 10.13 Formulation of the Seervada Park shortest path problem as a minimum cost flow problem.


THE MAXIMUM FLOW PROBLEM: The last special case we shall consider is the
maximum flow problem described in Sec. 10.5. In this case a network already is
provided with one supply node, one demand node, and various transshipment nodes,
as well as the various arcs and arc capacities. Only three adjustments are needed to
fit this problem into the format for the minimum cost flow problem. One is to set
$c_{ij} = 0$ for all existing arcs to reflect the absence of costs in the maximum flow
problem. A second is to select a quantity F, which is a safe upper bound on the
maximum feasible flow through the network, and then to assign a supply and a demand
of F to the supply node and the demand node, respectively. The third is to add an
arc going directly from the supply node to the demand node and to assign it an
arbitrarily large unit cost of $c_{ij} = M$ as well as an unlimited arc capacity ($u_{ij} = \infty$).
Because of this huge cost, the minimum cost flow problem will send the maximum
feasible flow through the other arcs, which achieves the objective of the maximum
flow problem.

Applying this formulation to the Seervada Park maximum flow problem shown
in Fig. 10.5 yields the network given in Fig. 10.14.


Figure 10.14 Formulation of the Seervada Park maximum flow problem as a minimum cost flow
problem.
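
The three adjustments for the maximum flow problem can be sketched as below. Everything named here is an assumption made for illustration: the function name, the numeric stand-in for M, the tiny network, and the particular choice of F as the total capacity of the arcs leaving the supply node (a valid upper bound, since no flow can exceed it). The sketch also assumes the network has no existing arc directly from the supply node to the demand node.

```python
import math

def max_flow_to_min_cost_flow(nodes, capacities, source, sink, big_m=10**6):
    """capacities: {(i, j): arc capacity}; big_m stands in for the huge cost M."""
    # Adjustment 2: F is a safe upper bound on the maximum feasible flow.
    F = sum(cap for (i, _), cap in capacities.items() if i == source)
    b = {n: 0 for n in nodes}
    b[source], b[sink] = F, -F
    # Adjustment 1: zero cost on every existing arc.
    arcs = {arc: {"cost": 0, "capacity": cap} for arc, cap in capacities.items()}
    # Adjustment 3: a direct, uncapacitated arc with cost M carries whatever cannot get through.
    arcs[(source, sink)] = {"cost": big_m, "capacity": math.inf}
    return b, arcs

# Hypothetical network: O -> A (capacity 5), A -> T (capacity 3).
b, arcs = max_flow_to_min_cost_flow(["O", "A", "T"], {("O", "A"): 5, ("A", "T"): 3}, "O", "T")
print(b)
print(arcs[("O", "T")])
```

In a solution of the resulting problem, the maximum flow equals F minus the flow on the added arc.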


FINAL COMMENTS: When each of these five problems was first presented, we 
described (or at least referenced) a special-purpose algorithm for solving it very ef- 
ficiently. Therefore, it certainly is not necessary to reformulate these special cases to 
fit the format of the minimum cost flow problem in order to solve them. However,
when a computer code is not readily available for the special-purpose algorithm, it is 
very reasonable to use the network simplex method instead. In fact, recent imple- 
mentations of the network simplex method have become so powerful that it now 
provides an excellent alternative to the special-purpose algorithm. This is especially 
true for the (uncapacitated) transshipment problem and, to some extent, the transpor- 
tation problem. 

The fact that these five problems are special cases of the minimum cost flow 
problem is of interest for other reasons as well. One is that the underlying theory for 
the minimum cost flow problem and the network simplex method provides a unifying 
theory for all of these special cases. Another is that some of the many applications 
of the minimum cost flow problem include features of one or more of the special cases 
within them, so it is important to know how to reformulate these features into the 
broader framework of the general problem. 


10.7 The Network Simplex Method 





The network simplex method is a highly streamlined version of the simplex method 
for solving minimum cost flow problems. As such, it goes through the same basic 
steps at each iteration—finding the entering basic variable, determining the leaving 
basic variable, and solving for the new basic feasible solution—in order to move from 
the current basic feasible solution to a better adjacent one. However, it executes these 
steps in ways that exploit the special network structure of the problem without ever 
needing a simplex tableau. 

You may note some similarities between the network simplex method and the 
transportation simplex method presented in Sec. 7.2. In fact, both are streamlined 
versions of the simplex method that provide alternative algorithms for solving trans- 
portation problems in similar ways. The network simplex method extends these ideas 
to solving other types of minimum cost flow problems as well. 

In this section, we provide a somewhat abbreviated description of the network 
simplex method that focuses just on the main concepts. We omit certain details needed 
for a full computer implementation, including how to construct an initial basic feasible 
solution, or how to perform certain calculations (such as for finding the entering basic 
variable) in the most efficient manner. These details are provided in various more 
specialized textbooks, such as Selected References 1, 2, 6, 7, 11, 12, and 14. 


Incorporating the Upper Bound Technique 


The first concept is to incorporate the upper bound technique described in Sec. 9.1 to
deal efficiently with the arc capacity constraints, $x_{ij} \le u_{ij}$. Thus, rather than being
treated as functional constraints, these constraints are handled just like nonnegativity
constraints. Therefore, they are considered only when determining the leaving basic
variable. In particular, as the entering basic variable is increased from zero, the leaving
basic variable is the first basic variable that reaches either its lower bound (0) or its
upper bound ($u_{ij}$). A nonbasic variable at its upper bound, $x_{ij} = u_{ij}$, is replaced by
$x_{ij} = u_{ij} - y_{ij}$, so $y_{ij} = 0$ becomes the nonbasic variable. See Sec. 9.1 for further
details.


In our current context, $y_{ij}$ has an interesting network interpretation. Whenever
$y_{ij}$ becomes a basic variable with a strictly positive value ($\le u_{ij}$), this value can be
thought of as flow from node j to node i (so in the "wrong" direction through arc
i → j) that, in actuality, is cancelling that amount of the previously assigned flow
($x_{ij} = u_{ij}$) from node i to node j. Thus, when $x_{ij} = u_{ij}$ is replaced by $x_{ij} = u_{ij} - y_{ij}$,
we also replace the real arc i → j by the reverse arc j → i, where this new arc has
arc capacity $u_{ij}$ (the maximum amount of the $x_{ij} = u_{ij}$ flow that can be cancelled) and
unit cost $-c_{ij}$ (since each unit of flow cancelled saves $c_{ij}$). To reflect the flow of
$x_{ij} = u_{ij}$ through the deleted arc, we shift this amount of net flow generated from
node i to node j by decreasing $b_i$ by $u_{ij}$ and increasing $b_j$ by $u_{ij}$. Later, if $y_{ij}$ becomes
the leaving basic variable by reaching its upper bound, $y_{ij} = u_{ij}$ is replaced by $y_{ij} =
u_{ij} - x_{ij}$, with $x_{ij} = 0$ as the new nonbasic variable, so the above process would be
reversed (replace arc j → i by arc i → j, etc.) back to the original configuration.

Figure 10.15 The adjusted network for the example when the upper bound technique leads to replacing
$x_{AB} = 10$ by $x_{AB} = 10 - y_{AB}$.


To illustrate this process, consider the minimum cost flow problem shown in
Fig. 10.9. While the network simplex method is generating a sequence of basic
feasible solutions, suppose that $x_{AB}$ has become the leaving basic variable for some
iteration by reaching its upper bound of 10. Consequently, $x_{AB} = 10$ is replaced by
$x_{AB} = 10 - y_{AB}$, so $y_{AB} = 0$ becomes the new nonbasic variable. At the same time,
we replace arc A → B by arc B → A (with $y_{AB}$ as its flow quantity), and assign this
new arc a capacity of 10 and a unit cost of -2. To take $x_{AB} = 10$ into account, we
also decrease $b_A$ from 50 to 40 and increase $b_B$ from 40 to 50. The resulting adjusted
network is shown in Fig. 10.15.
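
The adjustment can be written as a small function and checked against the numbers above. The function name and data layout are illustrative assumptions; the data are those of Fig. 10.9. The sketch assumes no arc j → i already exists when arc i → j is reversed, which holds for arc A → B.

```python
import math

def replace_by_reverse_arc(b, arcs, arc):
    """arcs: {(i, j): (unit cost, capacity)}. Returns adjusted copies of b and arcs."""
    i, j = arc
    cost, capacity = arcs[arc]
    new_b = dict(b)
    new_b[i] -= capacity      # decrease b_i by u_ij
    new_b[j] += capacity      # increase b_j by u_ij
    new_arcs = {a: data for a, data in arcs.items() if a != arc}
    new_arcs[(j, i)] = (-cost, capacity)   # reverse arc: cost -c_ij, capacity u_ij
    return new_b, new_arcs

b = {"A": 50, "B": 40, "C": 0, "D": -30, "E": -60}
arcs = {
    ("A", "B"): (2, 10), ("A", "C"): (4, math.inf), ("A", "D"): (9, math.inf),
    ("B", "C"): (3, math.inf), ("C", "E"): (1, 80),
    ("D", "E"): (3, math.inf), ("E", "D"): (2, math.inf),
}

adjusted_b, adjusted_arcs = replace_by_reverse_arc(b, arcs, ("A", "B"))
print(adjusted_b)
print(adjusted_arcs[("B", "A")])

# Reversing the reverse arc restores the original configuration.
restored_b, restored_arcs = replace_by_reverse_arc(adjusted_b, adjusted_arcs, ("B", "A"))
```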

We shall soon illustrate the entire network simplex method with this same
example, starting with $y_{AB} = 0$ ($x_{AB} = 10$) as a nonbasic variable and so using Fig.
10.15. A later iteration will show $x_{CE}$ reaching its upper bound of 80 and so being
replaced by $x_{CE} = 80 - y_{CE}$, etc., and then the next iteration has $y_{AB}$ reaching its
upper bound of 10. You will see that all of these operations are performed directly
on the network, so we won't need to use the $x_{ij}$ or $y_{ij}$ labels for arc flows or even
keep track of which arcs are real arcs and which are reverse arcs (except when
recording the final solution).

Using the upper bound technique leaves the node constraints (flow out minus 
flow in = $b_i$) as the only functional constraints. Minimum cost flow problems tend
to have far more arcs than nodes, so the resulting number of functional constraints 
generally is only a small fraction of what it would have been if the arc capacity 
constraints had been included. The computation time for the simplex method goes up 
relatively rapidly with the number of functional constraints but only slowly with the 
number of variables (or the number of bounding constraints on these variables). There- 
fore, incorporating the upper bound technique here tends to provide a tremendous 
saving in computation time. 

However, this technique is not needed for uncapacitated minimum cost flow 
problems (including the first four special cases considered in the preceding section), 
where there are no arc capacity constraints. 


Correspondence between Basic Feasible Solutions 
and Feasible Spanning Trees 


The most important concept underlying the network simplex method is its network
representation of basic feasible solutions. Recall from Sec. 10.6 that with n nodes,
every basic feasible solution has $(n - 1)$ basic variables, where each basic variable
$x_{ij}$ represents the flow through arc i → j. These $(n - 1)$ arcs are referred to as basic
arcs. (Similarly, the arcs corresponding to the nonbasic variables, $x_{ij} = 0$ or $y_{ij} = 0$,
are called nonbasic arcs.)

A key property of basic arcs is that they never form undirected cycles. (This 
property prevents the resulting solution from being a weighted average of another pair 
of feasible solutions, which would violate one of the general properties of basic 
feasible solutions.) However, any set of (n — 1) arcs that contains no undirected 
cycles forms a spanning tree. Therefore, any set of basic arcs forms a spanning tree. 

Thus basic feasible solutions can be obtained by "solving" spanning trees, as
summarized below. 


A spanning tree solution is obtained as follows: 


1. For the arcs not in the spanning tree (the nonbasic arcs), set the corresponding
variables ($x_{ij}$ or $y_{ij}$) equal to zero.

2. For the arcs that are in the spanning tree (the basic arcs), solve for the corresponding
variables ($x_{ij}$ or $y_{ij}$) in the system of linear equations provided by the node
constraints.


(The network simplex method actually solves for the new basic feasible solution from 
the current one much more efficiently, without resolving this system of equations from 
scratch.) Note that this solution process does not consider either the nonnegativity 
constraints or the arc capacity constraints for the basic variables, so the resulting 
spanning tree solution may or may not be feasible with respect to these constraints — 
which leads to our next definition. 


A feasible spanning tree is a spanning tree whose solution from the node constraints
also satisfies all the other constraints ($0 \le x_{ij} \le u_{ij}$ or $0 \le y_{ij} \le u_{ij}$).





With these definitions, we now can summarize our key conclusion as follows: 


The fundamental theorem for the network simplex method: Basic solutions 
are spanning tree solutions (and conversely). Basic feasible solutions are solu- 
tions for feasible spanning trees (and conversely). 


To begin illustrating the application of this fundamental theorem, consider the
network shown in Fig. 10.15 that results from replacing $x_{AB} = 10$ by $x_{AB} = 10 - y_{AB}$
for our example in Fig. 10.9. One spanning tree for this network is the one shown
in Fig. 10.3e, where the arcs are A → D, D → E, C → E, and B → C. With these
as the basic arcs, the process of finding the spanning tree solution is shown below.
On the left is the set of node constraints given in Sec. 10.6 after substituting
$(10 - y_{AB})$ for $x_{AB}$, where the basic variables are shown in boldface. On the right, starting
at the top and moving down, is the sequence of steps for setting or calculating the
values of the variables.

$$\begin{array}{rcll}
 & & & y_{AB} = 0,\ x_{AC} = 0,\ x_{ED} = 0 \\
-y_{AB} + x_{AC} + \mathbf{x}_{AD} &=& 40 & \mathbf{x}_{AD} = 40. \\
y_{AB} + \mathbf{x}_{BC} &=& 50 & \mathbf{x}_{BC} = 50. \\
-x_{AC} - \mathbf{x}_{BC} + \mathbf{x}_{CE} &=& 0 & \text{so } \mathbf{x}_{CE} = 50. \\
-\mathbf{x}_{AD} + \mathbf{x}_{DE} - x_{ED} &=& -30 & \text{so } \mathbf{x}_{DE} = 10. \\
-\mathbf{x}_{CE} - \mathbf{x}_{DE} + x_{ED} &=& -60 & \text{Redundant.}
\end{array}$$


Figure 10.16 The initial feasible spanning tree and its solution for the example.


Since the values of all these basic variables satisfy the nonnegativity constraints and
the one relevant arc capacity constraint ($x_{CE} \le 80$), the spanning tree is a feasible
spanning tree, so we have a basic feasible solution.
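
One simple way to carry out step 2 by hand or in code is to work inward from the ends of the tree: a node touching only one remaining tree arc fixes the flow on that arc. The sketch below uses this leaf-by-leaf idea, which is an illustrative choice and not the more efficient updating scheme the network simplex method itself uses. The function name is hypothetical; the data are the $b_i$ values of Fig. 10.15 and the spanning tree above.

```python
def solve_spanning_tree(b, tree_arcs):
    """Return {arc: flow} solving the node constraints with all nonbasic flows at zero."""
    remaining = dict(b)     # net flow still to be accounted for at each node
    arcs = set(tree_arcs)
    flows = {}
    while arcs:
        # Find a leaf: a node touching exactly one remaining tree arc.
        for leaf in list(remaining):
            incident = [a for a in arcs if leaf in a]
            if len(incident) == 1:
                break
        else:
            raise ValueError("the arcs do not form a spanning tree")
        i, j = incident[0]
        # Flow out - flow in at the leaf must equal its remaining net flow.
        flows[(i, j)] = remaining[leaf] if leaf == i else -remaining[leaf]
        other = j if leaf == i else i
        remaining[other] += remaining[leaf]   # pass the leaf's net flow along the arc
        del remaining[leaf]
        arcs.remove((i, j))
    return flows

b = {"A": 40, "B": 50, "C": 0, "D": -30, "E": -60}          # Fig. 10.15
tree = [("A", "D"), ("D", "E"), ("C", "E"), ("B", "C")]
solution = solve_spanning_tree(b, tree)
print(solution)

# Feasibility: nonnegativity everywhere, plus the one relevant capacity u_CE = 80.
feasible = all(x >= 0 for x in solution.values()) and solution[("C", "E")] <= 80
```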

We shall use this solution as the initial basic feasible solution for demonstrating 
the network simplex method. Figure 10.16 shows its network representation, namely, 
the feasible spanning tree and its solution. Thus the numbers given next to the arcs
now represent flows (values of the $x_{ij}$) rather than the unit costs $c_{ij}$ previously given.
(To help you distinguish, we shall always put parentheses around flows but not around 
costs.) 


Selecting the Entering Basic Variable 


To begin an iteration of the network simplex method, recall that the standard simplex 
method criterion for selecting the entering basic variable is to choose the nonbasic 
variable that, when increased from zero, will improve Z at the fastest rate. Now let 
us see how this is done without having a simplex tableau. 

To illustrate, consider the nonbasic variable $x_{AC}$ in our initial basic feasible
solution, i.e., the nonbasic arc A → C. Increasing $x_{AC}$ from zero to some value $\theta$
means that the arc A → C with flow $\theta$ must be added to the network shown in Fig.
10.16. Adding a nonbasic arc to a spanning tree always creates a unique undirected
cycle, where the cycle in this case is seen in Fig. 10.17 to be AC-CE-DE-AD. Figure
10.17 also shows the effect of adding the flow $\theta$ to arc A → C on the other flows in
the network. Specifically, the flow is thereby increased by $\theta$ for other arcs that have
the same direction as A → C in the cycle (arc C → E), whereas the net flow is
decreased by $\theta$ for other arcs whose direction is opposite to A → C in the cycle (arcs
D → E and A → D). In the latter case, the new flow is, in effect, cancelling a flow
of $\theta$ in the opposite direction. Arcs not in the cycle (B → C) are unaffected by the
new flow. (Check these conclusions by noting the effect of the change in $x_{AC}$ on the
values of the other variables in the solution just derived for the initial feasible spanning
tree.)


Figure 10.17 The effect on flows of adding arc A → C with flow $\theta$ to the initial feasible spanning tree.


Figure 10.18 The incremental effect on costs of adding arc A → C with flow $\theta$ to the initial feasible
spanning tree.

Now what is the incremental effect on Z (total flow cost) from adding the flow
$\theta$ to arc A → C, etc.? Figure 10.18 shows most of the answer by giving the unit cost
times the change in the flow for each arc of Fig. 10.17. Therefore, the overall
increment in Z is

$$\begin{aligned}
\Delta Z &= c_{AC}\theta + c_{CE}\theta + c_{DE}(-\theta) + c_{AD}(-\theta) \\
&= 4\theta + \theta - 3\theta - 9\theta \\
&= -7\theta.
\end{aligned}$$

Setting $\theta = 1$ then gives the rate of change of Z as $x_{AC}$ is increased, namely,

$$\Delta Z = -7, \quad \text{when } \theta = 1.$$

Because the objective is to minimize Z, this large rate of decrease in Z by increasing
$x_{AC}$ is very desirable, so $x_{AC}$ becomes a prime candidate to be the entering basic
variable.

We now need to perform the same analysis for the other nonbasic variables 
before making the final selection of the entering basic variable. The only other 
nonbasic variables are $y_{AB}$ and $x_{ED}$, corresponding to the two other nonbasic arcs,
B → A and E → D, in Fig. 10.15.

Figure 10.19 shows the incremental effect on costs of adding arc B → A with
flow $\theta$ to the initial feasible spanning tree given in Fig. 10.16. Adding this arc creates
the undirected cycle BA-AD-DE-CE-BC, so the flow increases by $\theta$ for arcs A → D
and D → E, but decreases by $\theta$ for the two arcs in the opposite direction on this
cycle, C → E and B → C. These flow increments, $\theta$ and $-\theta$, are the multiplicands
for the $c_{ij}$ values in the figure. Therefore,

$$\begin{aligned}
\Delta Z &= -2\theta + 9\theta + 3\theta + 1(-\theta) + 3(-\theta) = 6\theta \\
&= 6, \quad \text{when } \theta = 1.
\end{aligned}$$


Figure 10.19 The incremental effect on costs of adding arc B → A with flow $\theta$ to the initial feasible
spanning tree.


The fact that Z increases rather than decreases when $y_{AB}$ (flow through the reverse arc
B → A) is increased from zero rules out this variable as a candidate to be the entering
basic variable. (Remember that increasing $y_{AB}$ from zero really means decreasing $x_{AB}$,
flow through the real arc A → B, from its upper bound of 10.)

A similar result is obtained for the last nonbasic arc E → D. Adding this arc
with flow $\theta$ to the initial feasible spanning tree creates the undirected cycle ED-DE
shown in Fig. 10.20, so the flow also increases by $\theta$ for arc D → E, but no other
arcs are affected. Therefore,

$$\begin{aligned}
\Delta Z &= 2\theta + 3\theta = 5\theta \\
&= 5, \quad \text{when } \theta = 1,
\end{aligned}$$

so $x_{ED}$ is ruled out as a candidate to be the entering basic variable.


Figure 10.20 The incremental effect on costs of adding arc E → D with flow $\theta$ to the initial feasible
spanning tree.


To summarize,

$$\Delta Z = \begin{cases}
-7, & \text{if } \Delta x_{AC} = 1 \\
6, & \text{if } \Delta y_{AB} = 1 \\
5, & \text{if } \Delta x_{ED} = 1
\end{cases}$$

so the negative value for $x_{AC}$ implies that $x_{AC}$ becomes the entering basic variable for
the first iteration. If there had been more than one nonbasic variable with a negative
value of $\Delta Z$, the one having the largest absolute value would have been chosen. (If
there had been no nonbasic variables with a negative value of $\Delta Z$, the current basic
feasible solution would have been optimal.)
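
The undirected-cycle calculation above can be sketched in code. The function below walks the unique tree path from the head of the nonbasic arc back to its tail, counting +1 for tree arcs traversed in their own direction and -1 for arcs traversed against it. The function name and data layout are illustrative assumptions, and the input is assumed to be a spanning tree. The unit costs are those of the adjusted network in Fig. 10.15 (so the reverse arc B → A has cost -2).

```python
def unit_cost_change(tree_arcs, cost, entering):
    """Delta Z when theta = 1 unit of flow is sent through the nonbasic arc `entering`."""
    tail, head = entering

    def path(node, target, used):
        # Depth-first search along tree arcs, ignoring their directions.
        if node == target:
            return []
        for (i, j) in tree_arcs:
            if (i, j) in used:
                continue
            if node == i:
                nxt, sign = j, +1    # same direction as the arc: its flow increases
            elif node == j:
                nxt, sign = i, -1    # against the arc: its flow decreases
            else:
                continue
            rest = path(nxt, target, used | {(i, j)})
            if rest is not None:
                return [((i, j), sign)] + rest
        return None                  # dead end

    cycle = [(entering, +1)] + path(head, tail, frozenset())
    return sum(sign * cost[arc] for arc, sign in cycle)

cost = {("A", "C"): 4, ("A", "D"): 9, ("B", "A"): -2, ("B", "C"): 3,
        ("C", "E"): 1, ("D", "E"): 3, ("E", "D"): 2}
tree = [("A", "D"), ("D", "E"), ("C", "E"), ("B", "C")]      # Fig. 10.16

delta = {arc: unit_cost_change(tree, cost, arc)
         for arc in [("A", "C"), ("B", "A"), ("E", "D")]}
entering_arc = min(delta, key=delta.get)   # most negative Delta Z
print(delta, entering_arc)
```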

Rather than identifying undirected cycles, etc., the network simplex method 
actually obtains these AZ values by an algebraic procedure that is considerably more 
efficient (especially for large networks). The procedure is analogous to that used by 
the transportation simplex method (see Sec. 7.2) to solve for the $u_i$ and $v_j$ in order to
obtain the value of $(c_{ij} - u_i - v_j)$ for each nonbasic variable $x_{ij}$. We shall not describe
this procedure further, so you should just use the undirected cycles method when 
doing problems at the end of the chapter. 


Finding the Leaving Basic Variable and the Next Basic Feasible Solution 


After selecting the entering basic variable, only one more quick step is needed to
simultaneously determine the leaving basic variable and solve for the next basic
feasible solution. For the first iteration of the example, the key is Fig. 10.17. Since $x_{AC}$
is the entering basic variable, the flow $\theta$ through arc A → C is to be increased from
zero as far as possible until one of the basic variables reaches either its lower bound
(0) or its upper bound ($u_{ij}$). For those arcs whose flow increases with $\theta$ in Fig. 10.17
(arcs A → C and C → E), only the upper bounds ($u_{AC} = \infty$ and $u_{CE} = 80$) need to
be considered:

$$x_{AC} = \theta \le \infty.$$

$$x_{CE} = 50 + \theta \le 80, \quad \text{so } \theta \le 30.$$

For those arcs whose flow decreases with $\theta$ (arcs D → E and A → D), only the lower
bound of 0 needs to be considered:

$$x_{DE} = 10 - \theta \ge 0, \quad \text{so } \theta \le 10.$$

$$x_{AD} = 40 - \theta \ge 0, \quad \text{so } \theta \le 40.$$

Arcs whose flow is unchanged by $\theta$ (i.e., those not part of the undirected cycle),
which is just arc B → C in Fig. 10.17, can be ignored since no bound will be reached
as $\theta$ is increased.

For the five arcs in Fig. 10.17, the conclusion is that $x_{DE}$ must be the leaving
basic variable because it reaches a bound for the smallest value of $\theta$ (10). Setting
$\theta = 10$ in this figure thereby yields the flows through the basic arcs in the next basic
feasible solution:

$$\begin{aligned}
x_{AC} &= \theta = 10, \\
x_{CE} &= 50 + \theta = 60, \\
x_{AD} &= 40 - \theta = 30, \\
x_{BC} &= 50.
\end{aligned}$$


The corresponding feasible spanning tree is shown in Fig. 10.21. 
Figure 10.21 The second feasible spanning tree and its solution for the example.


If the leaving basic variable had reached its upper bound, then the adjustments
discussed for the upper bound technique would have been needed at this point (as
you will see illustrated during the next two iterations). However, because it was the
lower bound of 0 that was reached, nothing more needs to be done.
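
The step of finding the leaving basic variable and the new flows can be sketched as follows, assuming the cycle is given as a list of (arc, sign) pairs with sign +1 where the flow increases with $\theta$ and -1 where it decreases. The function names are hypothetical; the data are those of the first iteration (Fig. 10.17).

```python
import math

def ratio_test(cycle, flow, capacity):
    """Return (theta, leaving arc, bound reached) for the arcs on the cycle."""
    best = (math.inf, None, None)
    for arc, sign in cycle:
        if sign > 0:   # flow increases: only the upper bound can be reached
            limit, bound = capacity.get(arc, math.inf) - flow.get(arc, 0), "upper"
        else:          # flow decreases: only the lower bound of 0 can be reached
            limit, bound = flow[arc], "lower"
        if limit < best[0]:
            best = (limit, arc, bound)
    return best

def push_flow(cycle, flow, theta):
    """Add theta on arcs with sign +1 and subtract it on arcs with sign -1."""
    new_flow = dict(flow)
    for arc, sign in cycle:
        new_flow[arc] = new_flow.get(arc, 0) + sign * theta
    return new_flow

# First iteration: arc A -> C enters, creating the cycle AC-CE-DE-AD.
cycle = [(("A", "C"), +1), (("C", "E"), +1), (("D", "E"), -1), (("A", "D"), -1)]
flow = {("A", "D"): 40, ("B", "C"): 50, ("C", "E"): 50, ("D", "E"): 10}
capacity = {("C", "E"): 80}     # arcs not listed are treated as uncapacitated

theta, leaving, bound = ratio_test(cycle, flow, capacity)
new_flow = push_flow(cycle, flow, theta)
print(theta, leaving, bound)
print(new_flow)
```

The returned bound label indicates whether the arc reversal of the upper bound technique is needed ("upper") or not ("lower").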


COMPLETING THE EXAMPLE: For the two remaining iterations needed to reach the 
optimal solution, the primary focus will be on some features of the upper bound 
technique they illustrate. The pattern for finding the entering basic variable, the leaving 
basic variable, and the next basic feasible solution will be very similar to that described 
for the first iteration, so we shall only summarize these steps briefly. 


Iteration 2: Starting with the feasible spanning tree shown in Fig. 10.21, and referring
back to Fig. 10.15 for the unit costs ($c_{ij}$), the calculations for selecting the entering
basic variable are given in Table 10.3. The second column identifies the unique
undirected cycle that is created by adding the nonbasic arc in the first column to this
spanning tree, and the third column shows the incremental effect on costs because of
the changes in flows on this cycle caused by adding a flow of $\theta = 1$ to the nonbasic
arc. Arc E → D has the largest (in absolute terms) negative value of $\Delta Z$, so $x_{ED}$ is
the entering basic variable.

We now make the flow $\theta$ through arc E → D as large as possible while satisfying
the following flow bounds:

$$\begin{aligned}
x_{ED} &= \theta \le u_{ED} = \infty, & &\text{so } \theta \le \infty. \\
x_{AD} &= 30 - \theta \ge 0, & &\text{so } \theta \le 30. \\
x_{AC} &= 10 + \theta \le u_{AC} = \infty, & &\text{so } \theta \le \infty. \\
x_{CE} &= 60 + \theta \le u_{CE} = 80, & &\text{so } \theta \le 20. \quad \leftarrow \text{Minimum}
\end{aligned}$$

Because $x_{CE}$ imposes the smallest upper bound (20) on $\theta$, $x_{CE}$ becomes the leaving
basic variable. Setting $\theta = 20$ in the above expressions for $x_{ED}$, $x_{AD}$, and $x_{AC}$ then
yields the flow through the basic arcs for the next basic feasible solution (with $x_{BC} =
50$ unaffected by $\theta$), as shown in Fig. 10.22.


Table 10.3 Calculations for Selecting the Entering Basic
Variable for Iteration 2

Nonbasic Arc | Cycle Created | $\Delta Z$ when $\theta = 1$
B → A        | BA-AC-BC      | -2 + 4 - 3 = -1
D → E        | DE-CE-AC-AD   | 3 - 1 - 4 + 9 = 7
E → D        | ED-AD-AC-CE   | 2 - 9 + 4 + 1 = -2  ← Minimum


Figure 10.22 The third feasible spanning tree and its solution for the example.

What is of special interest here is that the leaving basic variable $x_{CE}$ was obtained
by the variable reaching its upper bound (80). Therefore, by using the upper bound
technique, $x_{CE}$ is replaced by $80 - y_{CE}$, where $y_{CE} = 0$ is the new nonbasic variable.
At the same time, the original arc C → E with $c_{CE} = 1$ and $u_{CE} = 80$ is replaced
by the reverse arc E → C with $c_{EC} = -1$ and $u_{EC} = 80$. The values of $b_E$ and $b_C$
also are adjusted by adding 80 to $b_E$ and subtracting 80 from $b_C$. The resulting adjusted
network is shown in Fig. 10.23, where the nonbasic arcs are shown as dashed lines,
and the numbers by all the arcs are unit costs.


Iteration 3: Using Figs. 10.22 and 10.23 to initiate the next iteration, Table 10.4
shows the calculations that lead to selecting $y_{AB}$ (reverse arc B → A) as the entering
basic variable. We then add as much flow $\theta$ through arc B → A as possible while
satisfying the flow bounds below:

$$\begin{aligned}
y_{AB} &= \theta \le u_{BA} = 10, & &\text{so } \theta \le 10. \quad \leftarrow \text{Minimum} \\
x_{AC} &= 30 + \theta \le u_{AC} = \infty, & &\text{so } \theta \le \infty. \\
x_{BC} &= 50 - \theta \ge 0, & &\text{so } \theta \le 50.
\end{aligned}$$

The smallest upper bound (10) on $\theta$ is imposed by $y_{AB}$, so this variable becomes the
leaving basic variable. Setting $\theta = 10$ in these expressions for $x_{AC}$ and $x_{BC}$ (along
with the unchanged values of $x_{AD} = 10$ and $x_{ED} = 20$) then yields the next basic
feasible solution, as shown in Fig. 10.24.


Figure 10.23 The adjusted network with unit costs at the completion of iteration 2.


Table 10.4 Calculations for Selecting the Entering Basic
Variable for Iteration 3

Nonbasic Arc | Cycle Created | $\Delta Z$ when $\theta = 1$
B → A        | BA-AC-BC      | -2 + 4 - 3 = -1  ← Minimum
D → E        | DE-EC-AC-AD   | 3 - 1 - 4 + 9 = 7
E → C        | EC-AC-AD-ED   | -1 - 4 + 9 - 2 = 2

As with iteration 2, the leaving basic variable ($y_{AB}$) was obtained here by the
variable reaching its upper bound. In addition, there are two other points of special
interest concerning this particular choice. One is that the entering basic variable $y_{AB}$
also became the leaving basic variable on the same iteration! This event occurs
occasionally with the upper bound technique whenever increasing the entering basic
variable from zero causes its upper bound to be reached first before any of the other
basic variables reach a bound.

The other interesting point is that the arc B → A that now needs to be replaced
by a reverse arc A → B (because of the leaving basic variable reaching an upper
bound) already is a reverse arc! This is no problem, because the reverse arc for a
reverse arc is simply the original real arc. Therefore, the arc B → A (with $c_{BA} = -2$
and $u_{BA} = 10$) in Fig. 10.23 now is replaced by arc A → B (with $c_{AB} = 2$ and
$u_{AB} = 10$), which is the arc between nodes A and B in the original network shown
in Fig. 10.9, and a generated net flow of 10 is shifted from node B ($b_B = 50 \to 40$)
to node A ($b_A = 40 \to 50$). Simultaneously, the variable $y_{AB} = 10$ is replaced by
$10 - x_{AB}$, with $x_{AB} = 0$ as the new nonbasic variable.

The resulting adjusted network is shown in Fig. 10.25. 


Passing the Optimality Test: At this point, the algorithm would attempt to use Figs. 
10.24 and 10.25 to find the next entering basic variable with the usual calculations 
shown in Table 10.5. However, none of the nonbasic arcs gives a negative value of 
ΔZ, so an improvement in Z cannot be achieved by introducing flow through any of 
them. This means that the current basic feasible solution shown in Fig. 10.24 has 
passed the optimality test, so the algorithm stops. 


Figure 10.24 The fourth (and final) feasible spanning tree and its solution for the example. 


Figure 10.25 The adjusted network with unit costs at the completion of iteration 3. 


To identify the flows through real arcs rather than reverse arcs for this optimal 
solution, the current adjusted network (Fig. 10.25) should be compared with the 
original network (Fig. 10.9). Note that each of the arcs has the same direction in the 
two networks with the one exception of the arc between nodes C and E. This means 
that the only reverse arc in Fig. 10.25 is arc E → C, where its flow is given by the 
variable $y_{CE}$. Therefore, calculate $x_{CE} = u_{CE} - y_{CE} = 80 - y_{CE}$. Arc E → C happens 
to be a nonbasic arc, so $y_{CE} = 0$, so $x_{CE} = 80$ is the flow through the real arc C → E. 
All the other flows through real arcs are the flows given in Fig. 10.24. Therefore, 
the optimal solution is the one shown in Fig. 10.26. 
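
The conversion from the adjusted network back to real arcs can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than part of the algorithm itself: the unit costs and net flows below are taken to be those of the original network as they appear in the example's tables and node labels, and the helper names `real_flows` and `net_outflow` are hypothetical.

```python
# Flows in the adjusted network at the end of iteration 3.
adjusted_flow = {
    ("A", "C"): 40, ("B", "C"): 40, ("A", "D"): 10, ("E", "D"): 20,  # basic arcs
    ("A", "B"): 0, ("D", "E"): 0, ("E", "C"): 0,                     # nonbasic arcs
}

# Each reverse arc maps to its real arc and that arc's upper bound.
reverse_arcs = {("E", "C"): (("C", "E"), 80)}


def real_flows(adjusted, reverse):
    """Replace each reverse-arc flow y by x = u - y on the corresponding real arc."""
    flows = {}
    for arc, value in adjusted.items():
        if arc in reverse:
            real_arc, upper = reverse[arc]
            flows[real_arc] = upper - value
        else:
            flows[arc] = value
    return flows


x = real_flows(adjusted_flow, reverse_arcs)

# Assumed data of the original network: unit costs c_ij and net flows b_i.
unit_cost = {("A", "B"): 2, ("A", "C"): 4, ("A", "D"): 9, ("B", "C"): 3,
             ("C", "E"): 1, ("D", "E"): 3, ("E", "D"): 2}
net_flow = {"A": 50, "B": 40, "C": 0, "D": -30, "E": -60}


def net_outflow(node, flows):
    """Flow out of the node minus flow into it."""
    out_flow = sum(v for (i, j), v in flows.items() if i == node)
    in_flow = sum(v for (i, j), v in flows.items() if j == node)
    return out_flow - in_flow


# Every node should satisfy its net-flow requirement in the original network.
balanced = all(net_outflow(n, x) == b for n, b in net_flow.items())
total_cost = sum(unit_cost[arc] * v for arc, v in x.items())
```

Checking `balanced` is a quick way to confirm that no supply or demand adjustment from the upper bound technique has been left in place by mistake.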


10.8 Project Planning and Control with PERT-CPM 


The successful management of large-scale projects requires careful planning, sched- 
uling, and coordinating of numerous interrelated activities. To aid in these tasks, 
formal procedures based on the use of networks and network techniques were devel- 
oped beginning in the late 1950s. The most prominent of these procedures are PERT 
(Program Evaluation and Review Technique) and CPM (Critical Path Method), al- 
though there have been many variants under different names. As you will see later, 
there are a few important differences between these two procedures. However, in 
recent years the trend has been to merge the two approaches into what is usually 
referred to as a PERT-type system. 

Although the original application of PERT-type systems was for evaluating a 
schedule for a research and development program, it is also used to measure and 
control progress on numerous other types of special projects. Examples of these project 
types include construction programs, programming of computers, preparation of bids 
and proposals, maintenance planning, and the installation of computer systems. This 
kind of approach has even been applied to the production of movies, political 
campaigns, and complex surgery. 


Table 10.5 Calculations for the Optimality Test 
at the End of Iteration 3 

Nonbasic Arc    Cycle Created    ΔZ when θ = 1 
A → B           AB-BC-AC         2 + 3 - 4 = 1 
D → E           DE-EC-AC-AD      3 - 1 - 4 + 9 = 7 
E → C           EC-AC-AD-ED      -1 - 4 + 9 - 2 = 2 


Figure 10.26 The optimal flow pattern in the original network for the example. 

A PERT-type system is designed to aid in planning and control, so it may not 
involve much direct optimization. Sometimes one of the primary objectives is to 
determine the probability of meeting specified deadlines. It also identifies the activities 
that are most likely to be bottlenecks and, therefore, the places where the greatest 
effort should be made to stay on schedule. A third objective is to evaluate the effect 
of changes in the program. For example, it will evaluate the effect of a contemplated 
shift of resources from the less critical activities to the activities identified as probable 
bottlenecks. Other resource and performance trade-offs may also be evaluated. An- 
other important use is to evaluate the effect of deviations from schedule. 

All PERT-type systems use a project network to portray graphically the 
interrelationships among the elements of a project. This network representation of the 
project plan shows all the precedence relationships regarding the order in which tasks 
must be performed. This feature is illustrated by Fig. 10.27, which shows the initial 
project network for building a house. This network indicates that the excavation must 
be done before laying the foundation, and then the foundation must be completed 
before putting up the rough wall. Once the rough wall is up, three tasks (rough 
electrical work, rough exterior plumbing, and putting up the roof) can be done in 
parallel. Tracing through the network further then spells out the ordering of subsequent 
tasks. 

In the terminology of PERT, each arc of the project network represents an 
activity that is one of the tasks required by the project. Each node represents an event 
that usually is defined as the point in time when all activities leading into that node 
are completed. The arrowheads indicate the sequences in which the events must be 
achieved. Furthermore, an event must precede the initiation of the activities leading 
out of that node. (In reality, it is often possible to overlap successive phases of a 
project, so the network may represent an approximate idealization of the project plan.) 

The node toward which all activities lead is the event that corresponds to the 
completion of the currently planned project. The network may represent either the 
plan for the project from its inception or, if the project has already begun, the plan 
for the completion of the project. In the latter case, each node without incoming arcs 
represents either the event of continuing a current activity or the event of initiating a 
new activity that may begin at any time. 

Figure 10.27 Initial project network for constructing a house. 


Each arc plays a dual role of representing an activity and helping to show the 
precedence relationships between the various activities. Occasionally, an arc is needed 
to further define precedence relationships even when there is no real activity to be 
represented. In this case, a dummy activity requiring zero time is introduced, where 
the arc representing this fictional activity is shown as a dashed-line arrow that indicates 
a precedence relationship. To illustrate, consider arc 5 —> 8 representing a dummy 
activity in Fig. 10.27. The sole purpose of this arc is to indicate that the rough exterior 
plumbing must be completed before the exterior painting can begin. 

A common rule for constructing these project networks is that two nodes can 
be directly connected by no more than one arc. Dummy activities can also be used 
to avoid violating this rule when there are two or more concurrent activities, as 
illustrated by arc 11 —> 12 in Fig. 10.27. The sole purpose of this arc is to indicate 
that the flooring must be completed before installing the interior fixtures, without 
having two arcs from node 9 to node 12. 

After the network for a project has been developed, the next step is to estimate 
the time required for each of the activities. These estimates for the house-construction 
example of Fig. 10.27 are shown by the darker numbers (in units of work days) next 
to the arcs in Fig. 10.28. These times are used to calculate two basic quantities for 
each event, namely, its earliest time and its latest time. 


The earliest time for an event is the (estimated) time at which the event will occur if 
the preceding activities are started as early as possible. 


The earliest times are obtained by making a forward pass through the network, starting 
with the initial events and working forward in time toward the final events. For each 
event, a calculation is made of the time at which that event will occur if each im- 
mediately preceding event occurs at its earliest time and each intervening activity 
consumes exactly its estimated time. The initiation of the project should be labeled 
as time 0. This process is shown in Table 10.6 for the example considered in Figs. 
10.27 and 10.28. The resulting earliest times are recorded in Fig. 10.28 as the first 
of the two numbers given by each node. 
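
The forward pass can be sketched in Python as follows. The activity times are the ones used in Table 10.6 for the house-construction example (zero-time entries are the dummy activities); the function name `earliest_times` and the assumption that events are numbered so that i < j for every activity (i, j) are introduced here for the sketch.

```python
# Activity times in work days for the house-construction network.
activity_time = {
    (1, 2): 2, (2, 3): 4, (3, 4): 10, (4, 5): 4, (4, 6): 6, (4, 7): 7,
    (5, 7): 5, (5, 8): 0, (6, 8): 7, (7, 9): 8, (8, 10): 9, (9, 11): 4,
    (9, 12): 5, (11, 12): 0, (10, 13): 2, (12, 13): 6,
}


def earliest_times(times):
    """Forward pass: earliest time of each event, with the project start at time 0.

    Assumes events are numbered so that i < j for every activity (i, j),
    so processing events in increasing order visits predecessors first.
    """
    events = sorted({event for arc in times for event in arc})
    earliest = {events[0]: 0}
    for j in events[1:]:
        # Maximum over immediately preceding events of (earliest time + activity time).
        earliest[j] = max(earliest[i] + t for (i, k), t in times.items() if k == j)
    return earliest


earliest = earliest_times(activity_time)
```

For an event with several incoming activities, such as event 7, the maximum over its predecessors is what makes the result the earliest time at which all incoming activities can be complete.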


The latest time for an event is the (estimated) last time at which the event can occur 
without delaying the completion of the project beyond its earliest time. 


Figure 10.28 Final project network for constructing a house. 


Table 10.6 Calculation of Earliest Times 
for House-Construction Example 

         Immediately 
         Preceding     Earliest   Activity     Maximum 
Event    Event         Time     + Time       = Earliest Time 

  1      -             -                        0 
  2      1              0 +  2                  2 
  3      2              2 +  4                  6 
  4      3              6 + 10                 16 
  5      4             16 +  4                 20 
  6      4             16 +  6                 22 
  7      4             16 +  7                 25 
         5             20 +  5 
  8      5             20 +  0                 29 
         6             22 +  7 
  9      7             25 +  8                 33 
 10      8             29 +  9                 38 
 11      9             33 +  4                 37 
 12      9             33 +  5                 38 
        11             37 +  0 
 13     10             38 +  2                 44 
        12             38 +  6 


In this case, the latest times are obtained successively for the events by making a 
backward pass through the network, starting with the final events and working back- 
ward in time toward the initial events. For each event, a calculation is made of the 
final time the event can occur in order for each immediately following event to occur 
at its latest time if each intervening activity consumes exactly its estimated time. This 
process is illustrated in Table 10.7, with 44 as the earliest time and latest time for 
the completion of the house-construction project. The resulting latest times are re- 
corded in Fig. 10.28 as the second of the two numbers given by each node. 
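
A matching sketch of the backward pass is given below. It takes the project completion time (44 here) as an input rather than recomputing it; the function name `latest_times` and the event-numbering assumption (i < j for every activity) are introduced for this illustration.

```python
# Activity times in work days for the house-construction network.
activity_time = {
    (1, 2): 2, (2, 3): 4, (3, 4): 10, (4, 5): 4, (4, 6): 6, (4, 7): 7,
    (5, 7): 5, (5, 8): 0, (6, 8): 7, (7, 9): 8, (8, 10): 9, (9, 11): 4,
    (9, 12): 5, (11, 12): 0, (10, 13): 2, (12, 13): 6,
}


def latest_times(times, completion_time):
    """Backward pass: latest time of each event that does not delay completion.

    Assumes events are numbered so that i < j for every activity (i, j),
    so processing events in decreasing order visits successors first.
    """
    events = sorted({event for arc in times for event in arc}, reverse=True)
    latest = {events[0]: completion_time}
    for i in events[1:]:
        # Minimum over immediately following events of (latest time - activity time).
        latest[i] = min(latest[j] - t for (k, j), t in times.items() if k == i)
    return latest


latest = latest_times(activity_time, completion_time=44)
```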

Let activity (i, j) denote the activity going from event i to event j in the project 
network. 


The slack for an event is the difference between its latest and its earliest time. The 
slack for an activity (i, j) is the difference between [the latest time of event j] 
and [the earliest time of event i plus the estimated activity time]. 


Thus, assuming everything else remains on schedule, the slack for an event indicates 
how much delay in reaching the event can be tolerated without delaying the project’s 
completion, and the slack for an activity indicates the same thing regarding a delay 
in the completion of that activity. The calculation of these slacks is illustrated in Table 
10.8 for the house-construction project. 


A critical path for a project is a path through the network such that the activities on 
this path have zero slack. (All activities and events having zero slack must lie on a 
critical path, but no others can.) 


If we check the activities in Table 10.8 that have zero slack, we find that the 
house-construction example has one critical path, 1 → 2 → 3 → 4 → 5 → 7 → 9 → 
12 → 13, as shown in Fig. 10.28 by the dark arrows. Thus this sequence of critical 
activities must be kept strictly on schedule in order to avoid slippage in completing 
the project. Other projects may have more than one such critical path; e.g., note what 
would happen in Fig. 10.28 if the estimated time for activity (4, 6) were changed 
from 6 to 10. 


Table 10.7 Calculation of Latest Times 
for House-Construction Example 

         Immediately 
         Following     Latest     Activity     Minimum 
Event    Event         Time     - Time       = Latest Time 

 13      -             -                       44 
 12     13             44 -  6                 38 
 11     12             38 -  0                 38 
 10     13             44 -  2                 42 
  9     12             38 -  5                 33 
        11             38 -  4 
  8     10             42 -  9                 33 
  7      9             33 -  8                 25 
  6      8             33 -  7                 26 
  5      8             33 -  0                 20 
         7             25 -  5 
  4      7             25 -  7                 16 
         6             26 -  6 
         5             20 -  4 
  3      4             16 - 10                  6 
  2      3              6 -  4                  2 
  1      2              2 -  2                  0 

It is interesting to observe in Table 10.8 that, whereas every event on the critical 
path (including events 4 and 7) necessarily has zero slack, activity (4, 7) does not 
because its estimated time is less than the sum of the estimated times for activities 
(4, 5) and (5, 7). Consequently, the latter activities are on the critical path but activity 
(4, 7) is not. 
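
The slack calculations and the identification of critical activities can be sketched as follows. The earliest and latest times are copied from Tables 10.6 and 10.7; the variable names are introduced here, and, as an assumption of this sketch, the two dummy activities are included in the slack calculation even though they represent no real work.

```python
# Earliest and latest event times from the forward and backward passes.
earliest = {1: 0, 2: 2, 3: 6, 4: 16, 5: 20, 6: 22, 7: 25,
            8: 29, 9: 33, 10: 38, 11: 37, 12: 38, 13: 44}
latest = {1: 0, 2: 2, 3: 6, 4: 16, 5: 20, 6: 26, 7: 25,
          8: 33, 9: 33, 10: 42, 11: 38, 12: 38, 13: 44}

# Activity times in work days; (5, 8) and (11, 12) are dummy activities.
activity_time = {
    (1, 2): 2, (2, 3): 4, (3, 4): 10, (4, 5): 4, (4, 6): 6, (4, 7): 7,
    (5, 7): 5, (5, 8): 0, (6, 8): 7, (7, 9): 8, (8, 10): 9, (9, 11): 4,
    (9, 12): 5, (11, 12): 0, (10, 13): 2, (12, 13): 6,
}

# Slack for an event: latest time minus earliest time.
event_slack = {k: latest[k] - earliest[k] for k in earliest}

# Slack for activity (i, j): latest time of j minus (earliest time of i + activity time).
activity_slack = {
    (i, j): latest[j] - (earliest[i] + t) for (i, j), t in activity_time.items()
}

# Activities with zero slack make up the critical path.
critical_activities = sorted(a for a, s in activity_slack.items() if s == 0)
```

Note that events 4 and 7 both have zero slack while activity (4, 7) has a slack of 2, which is the situation described above.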

This information on earliest and latest times, slack, and the critical path is 
invaluable for the project manager. Among other things, it enables the manager to 
investigate the effect of possible improvements in the project plan, to determine where 
special effort should be expended to stay on schedule, and to assess the impact of 
schedule slippages. 


Table 10.8 Calculation of Slacks for House-Construction 
Example 

Event    Slack             Activity    Slack 

  1       0 -  0 = 0       (1, 2)       2 - ( 0 +  2) = 0 
  2       2 -  2 = 0       (2, 3)       6 - ( 2 +  4) = 0 
  3       6 -  6 = 0       (3, 4)      16 - ( 6 + 10) = 0 
  4      16 - 16 = 0       (4, 5)      20 - (16 +  4) = 0 
  5      20 - 20 = 0       (4, 6)      26 - (16 +  6) = 4 
  6      26 - 22 = 4       (4, 7)      25 - (16 +  7) = 2 
  7      25 - 25 = 0       (5, 7)      25 - (20 +  5) = 0 
  8      33 - 29 = 4       (6, 8)      33 - (22 +  7) = 4 
  9      33 - 33 = 0       (7, 9)      33 - (25 +  8) = 0 
 10      42 - 38 = 4       (8, 10)     42 - (29 +  9) = 4 
 11      38 - 37 = 1       (9, 11)     38 - (33 +  4) = 1 
 12      38 - 38 = 0       (9, 12)     38 - (33 +  5) = 0 
 13      44 - 44 = 0       (10, 13)    44 - (38 +  2) = 4 
                           (12, 13)    44 - (38 +  6) = 0 


The PERT Three-Estimate Approach 


Thus far we have implicitly assumed that reasonably accurate estimates can be made 
of the time required for each activity of the project. In actuality, there frequently is 
considerable uncertainty about what the time will be; it really is a random variable 
having some probability distribution. The original version of PERT took this uncer- 
tainty into account by using three different types of estimates of the activity time to 
obtain basic information about its probability distribution. This information for all the 
activity times is then used to estimate the probability of completing the project by the 
scheduled date. 

The three time estimates used by PERT for each activity are a most likely 
estimate, an optimistic estimate, and a pessimistic estimate. The most likely estimate 
(denoted by m) is intended to be the most realistic estimate of the time the activity 
might consume. Statistically speaking, it is an estimate of the mode (the highest point) 
of the probability distribution for the activity time. The optimistic estimate (denoted 
by a) is intended to be the unlikely but possible time if everything goes well. Statis- 
tically speaking, it is an estimate of essentially the lower bound of the probability 
distribution. The pessimistic estimate (denoted by b) is intended to be the unlikely 
but possible time if everything goes badly. Statistically speaking, it is an estimate of 
essentially the upper bound of the probability distribution. The intended location of 
these three estimates with respect to the probability distribution is shown in Fig. 10.29. 

Two assumptions are made to convert m, a, and b into estimates of the expected 
value ($t_e$) and variance ($\sigma^2$) of the elapsed time required by the activity. One 
assumption is that $\sigma$, the standard deviation (square root of the variance), equals 
one-sixth the range of reasonably possible time requirements; that is, 


$$\sigma^2 = \left[\tfrac{1}{6}(b - a)\right]^2$$


is the desired estimate of the variance. The rationale for this assumption is that the 
tails of many probability distributions (such as the normal distribution) are considered 
to lie about three standard deviations from the mean, so that there is a spread of about 
six standard deviations between the tails. For example, the control charts commonly 
used for statistical quality control are constructed so that the spread between the control 
limits is estimated to be six standard deviations. 

To obtain the estimated expected value ($t_e$), we also need an assumption about 
the form of the probability distribution. This assumption is that the distribution is (at 
least approximately) a beta distribution. This type of distribution has the form shown 
in Fig. 10.29, which is a reasonable one for this purpose. 


Figure 10.29 Model of the probability distribution of activity times for the PERT three-estimate 
approach: m = most likely estimate, a = optimistic estimate, and b = pessimistic estimate. 

If we use the model illustrated in Fig. 10.29, the expected value of the activity 
time is approximately 


$$t_e = \tfrac{1}{3}\left[2m + \tfrac{1}{2}(a + b)\right].$$


Notice that the midrange (a + b)/2 lies midway between a and b, so that $t_e$ is the 
weighted arithmetic mean of the mode and the midrange, the mode carrying 
two-thirds of the entire weight. Although the assumption of a beta distribution is an 
arbitrary one, it serves its purpose of locating the expected value with respect to m, 
a, and b in what seems to be a reasonable way. 
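
As a small numerical sketch of the two formulas, the Python function below converts the three estimates into the estimated expected value and variance. The function name `pert_estimates` and the sample values a = 2, m = 5, b = 14 are hypothetical and chosen only for illustration.

```python
def pert_estimates(a, m, b):
    """Return (expected value, variance) of an activity time.

    a = optimistic estimate, m = most likely estimate, b = pessimistic estimate.
    """
    # Weighted mean of the mode m (weight 2/3) and the midrange (a + b)/2 (weight 1/3).
    expected = (2 * m + (a + b) / 2) / 3
    # Standard deviation taken as one-sixth of the range b - a.
    variance = ((b - a) / 6) ** 2
    return expected, variance


# Hypothetical activity: optimistic 2, most likely 5, pessimistic 14 days.
t_e, sigma_squared = pert_estimates(a=2, m=5, b=14)
```

With these sample values the long pessimistic tail pulls the expected value above the most likely estimate, which is the usual effect of a skewed distribution.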

After calculating the estimated expected value and variance for each of the 
activity times, we need three additional assumptions (or approximations) to enable us 
to calculate the probability of completing the project on schedule. One is that the 
activity times are statistically independent. A second is that the critical path (in terms 
of expected times) always requires a longer total elapsed time than any other path. 
The resulting implication is that the expected value and variance of project time are 
just the sum of the expected values and variances (respectively) of the times for the 
activities on the critical path. 

The third assumption is that the project time has a normal distribution. The 
rationale for this assumption is that this time is the sum of many independent random 
variables, and the general version of the central limit theorem implies that the prob- 
ability distribution of such a sum is approximately normal under a wide range of 
conditions. Given the mean and variance, it is then straightforward (see Table A5.1) 
to find the probability that this normal random variable (project time) will be less than 
the scheduled completion time.¹ 

To illustrate, suppose that the house-construction project of Fig. 10.27 is sched- 
uled to be completed after 50 working days and that both the expected value and 
variance of each activity time happen to equal the estimated time given for that activity 
in Fig. 10.28. Therefore, when we add these quantities (separately) over the critical 
path, both the expected value and variance of project time are 44, so its standard 
deviation is $\sqrt{44} \approx 6.63$. Thus the scheduled completion time is approximately 0.9 
standard deviation above the expected project time. Table A5.1 then gives an 
approximate probability of $1 - 0.1841 \approx 0.82$ that this schedule will be met. 
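
The same probability can be checked without a table. The sketch below evaluates the standard normal distribution function through `math.erf` from the Python standard library; the helper name `normal_cdf` is introduced here, and the calculation uses the unrounded number of standard deviations rather than the rounded value 0.9.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean_project_time = 44       # sum of expected activity times on the critical path
variance_project_time = 44   # sum of activity-time variances on the critical path
scheduled_time = 50          # scheduled completion time in working days

# Number of standard deviations by which the schedule exceeds the expected time.
z = (scheduled_time - mean_project_time) / math.sqrt(variance_project_time)

# Probability that the (approximately normal) project time is at most the schedule.
probability_on_time = normal_cdf(z)
```

The result agrees with the table-based figure to two decimal places.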


The CPM Method of Time-Cost Trade-Offs 


The original versions of CPM and PERT differ in two important ways. First, CPM 
assumes that activity times are deterministic (i.e., they can be reliably predicted 
without significant uncertainty), so that the three-estimate approach just described is 
not needed. Second, rather than primarily emphasizing time (explicitly), CPM places 
equal emphasis on time and cost. This dual emphasis is achieved by constructing a 
time-cost curve for each activity, such as the one shown in Fig. 10.30. This curve 
plots the relationship between the budgeted direct cost² for the activity and its resulting 


¹ The same procedure can also be used to find the probability that an intermediate event will be accomplished 
before a scheduled time. 

² Direct cost includes the cost of the material, equipment, and direct labor required to perform the activity 
but excludes indirect project costs such as supervision and other customary overhead costs, interest charges, 
and so forth. 


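Figure 10.30 Time-cost curve for activity (i, j). [Vertical axis: activity direct cost, marked at 
the normal cost $C_{D_{ij}}$ and the crash cost $C_{d_{ij}}$. Horizontal axis: activity duration time, 
marked at the crash time $d_{ij}$ and the normal time $D_{ij}$.] 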


duration time. The plot normally is based on two points¹: the normal and the crash 
points. The normal point gives the cost and time involved when the activity is 
performed in the normal way without any extra costs (overtime labor, special time- 
saving materials or equipment, etc.) being expended to speed up the activity. By 
contrast, the crash point gives the time and cost involved when the activity is per- 
formed on a crash basis; i.e., it is fully expedited with no cost spared to reduce the 
duration time as much as possible. As an approximation, it is then assumed that all 
intermediate time-cost trade-offs also are possible and that they lie on the line segment 
between these two points (see the solid line segment shown in Fig. 10.30). Thus the 
only estimates that need to be obtained from the project personnel for this activity are 
the cost and time for the two points. 

The basic objective of CPM is to determine just which time-cost trade-off should 
be used for each activity to meet the scheduled project completion time at a minimum 
cost. One way of determining the optimal combination of time-cost trade-offs for all 
the activities is to use linear programming. To describe this approach, we need to 
introduce considerable notation, some of which is summarized in Fig. 10.30. Let 


$D_{ij}$ = normal time for activity (i, j). 

$C_{D_{ij}}$ = normal (direct) cost for activity (i, j). 

$d_{ij}$ = crash time for activity (i, j). 

$C_{d_{ij}}$ = crash (direct) cost for activity (i, j). 


The decision variables for the problem are the $x_{ij}$, where 


$x_{ij}$ = duration time for activity (i, j). 


Thus there is one decision variable $x_{ij}$ for each activity, but there is none for those 
values of i and j that do not have a corresponding activity. 


¹ More than two points can be used under certain circumstances. 

To express the direct cost for activity (i, j) as a (linear) function of $x_{ij}$, denote 
the slope of the line through the normal and crash points for activity (i, j) by 


$$S_{ij} = \frac{C_{D_{ij}} - C_{d_{ij}}}{D_{ij} - d_{ij}}.$$


Also define $K_{ij}$ as the intercept with the direct cost axis of this line, as shown in Fig. 
10.30. Therefore, 


$$\text{Direct cost for activity } (i, j) = K_{ij} + S_{ij} x_{ij}.$$


Consequently, 


$$\text{Total direct cost for the project} = \sum_{(i,j)} (K_{ij} + S_{ij} x_{ij}),$$

where the summation is over all activities (i, j). We are now ready to state and 
formulate the problem mathematically. 
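
To make the slope and intercept concrete, the following Python sketch computes them for a single activity and evaluates the resulting linear cost function. The function name `cost_line` and the sample activity (normal point 8 days at a cost of 1000, crash point 5 days at a cost of 1600) are hypothetical.

```python
def cost_line(normal_time, normal_cost, crash_time, crash_cost):
    """Return (slope S, intercept K) of the line through the normal and crash points."""
    slope = (normal_cost - crash_cost) / (normal_time - crash_time)
    # The line passes through the normal point, so K = normal cost - S * normal time.
    intercept = normal_cost - slope * normal_time
    return slope, intercept


# Hypothetical activity: normal point (8 days, 1000), crash point (5 days, 1600).
S, K = cost_line(normal_time=8, normal_cost=1000, crash_time=5, crash_cost=1600)


def direct_cost(duration):
    """Direct cost K + S * x for a duration x between the crash and normal times."""
    return K + S * duration
```

Because the crash cost exceeds the normal cost, the slope is negative: each day removed from the duration adds the same fixed amount to the direct cost.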


The problem: For a given (maximum) project completion time T, choose the 
$x_{ij}$ to minimize total direct cost for the project. 


LINEAR PROGRAMMING FORMULATION: To take the project completion time into 
account, we need one more variable for each event in the linear programming 
formulation of the problem. This additional variable is 


$y_k$ = (unknown) earliest time for event k, which is a deterministic 
function of the $x_{ij}$. 


Each $y_k$ is an auxiliary variable, i.e., a variable that is introduced into the model as 
a convenience in the formulation rather than representing a decision. However, the 
simplex method treats auxiliary variables just like the regular decision variables 
(the $x_{ij}$). 

To illustrate how the $y_k$ are worked into the formulation, consider event 7 in 
Fig. 10.27. By definition, its earliest time is 


$$y_7 = \max\{y_4 + x_{47},\; y_5 + x_{57}\}.$$


In other words, $y_7$ is the smallest quantity such that both of the following constraints 
hold: 


$$y_4 + x_{47} \le y_7$$
$$y_5 + x_{57} \le y_7$$


so that these two constraints can be incorporated directly into the linear programming 
formulation (after bringing $y_7$ to the left-hand side for proper form). Furthermore, we 
shall soon describe why the optimal solution obtained by the simplex method for the 
overall model automatically will have $y_7$ at the smallest quantity that satisfies these 
constraints, so no further constraints are needed to incorporate the definition of $y_7$ into 
the model. 

In the process of adding these constraints for all the events, every variable $x_{ij}$ 
will appear in exactly one constraint of this type, 

$$y_i + x_{ij} \le y_j,$$

which then is expressed in proper form as 

$$y_i + x_{ij} - y_j \le 0.$$


To continue the preparations for writing down the complete linear programming 
model, label 


Event 1 = project start 
Event n = project completion, 


so 

$y_1 = 0$ 
$y_n$ = (unknown) project completion time. 


Also note that $\sum K_{ij}$ is just a fixed constant that can be dropped from the objective 
function, so that minimizing total direct cost for the project is equivalent (see Sec. 
4.6) to maximizing $\sum (-S_{ij}) x_{ij}$. Therefore, the linear programming problem is to find 
the $x_{ij}$ (and the corresponding $y_k$) that 

$$
\begin{aligned}
\text{Maximize} \quad & Z = \sum_{(i,j)} (-S_{ij})\, x_{ij}, \\
\text{subject to} \quad & x_{ij} \ge d_{ij}, \quad x_{ij} \le D_{ij}, \quad y_i + x_{ij} - y_j \le 0 \quad \text{for all activities } (i, j), \\
& y_n \le T.
\end{aligned}
$$


From a computational viewpoint, this formulation can be improved somewhat 
by replacing each $x_{ij}$ by 


$$x_{ij} = d_{ij} + x'_{ij}$$


throughout the model, so that the first set of functional constraints ($x_{ij} \ge d_{ij}$) would 
be replaced by simple nonnegativity constraints 


$$x'_{ij} \ge 0.$$


As a convenience we can also introduce nonnegativity constraints for the other 
variables, 


$$y_k \ge 0,$$


although these variables already are forced to be nonnegative by setting $y_1 = 0$ because 
of the $x'_{ij} \ge 0$ and $y_j \ge y_i + d_{ij} + x'_{ij}$ constraints. 

One interesting property of an optimal solution for this model is that (under 
ordinary circumstances) every path through the network will be a critical path requiring 
a time of T. The reason is that such a solution satisfies the $y_n \le T$ constraint while 
avoiding the extra cost involved in shortening the time for any path. 
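
This property can be observed on a very small instance. The sketch below is not the simplex method: it simply enumerates all integer duration times for a hypothetical three-event project, keeps those meeting the deadline, and picks the cheapest. All data (times, costs, the deadline T = 7) and all names are made up for illustration.

```python
from itertools import product

# Hypothetical project: activity -> (normal time D, crash time d, normal cost, crash cost).
activities = {
    (1, 2): (4, 2, 100, 160),
    (2, 3): (5, 3, 80, 120),
    (1, 3): (8, 6, 150, 250),
}
T = 7  # hypothetical deadline for event 3


def direct_cost(arc, x):
    """Linear direct cost of running the activity with duration x."""
    D, d, normal_cost, crash_cost = activities[arc]
    slope = (normal_cost - crash_cost) / (D - d)  # S_ij, negative
    return normal_cost + slope * (x - D)          # same value as K_ij + S_ij * x


def completion_time(x):
    """Earliest time of the final event for durations x (forward pass)."""
    y = {1: 0}
    for j in (2, 3):
        y[j] = max(y[i] + x[(i, k)] for (i, k) in x if k == j)
    return y[3]


arcs = list(activities)
# Integer durations between the crash time and the normal time of each activity.
ranges = [range(activities[a][1], activities[a][0] + 1) for a in arcs]

best = None
for durations in product(*ranges):
    x = dict(zip(arcs, durations))
    if completion_time(x) <= T:
        total = sum(direct_cost(a, x[a]) for a in arcs)
        if best is None or total < best[0]:
            best = (total, x)

best_cost, best_x = best
```

In the cheapest feasible plan both paths, 1 → 2 → 3 and 1 → 3, take exactly T = 7 time units, so every path is critical, as the text predicts for ordinary circumstances.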

The key to this formulation is the way that the $y_k$ are introduced into the model 
through the $y_i + x_{ij} - y_j \le 0$ constraints in order to provide earliest times for the 
respective events (given the values of the $x_{ij}$ in the current basic feasible solution). 
Since earliest times must be obtained sequentially, all of these $y_k$ are needed for the 
sole purpose of ultimately obtaining the correct value of $y_n$ (for the current values of 
the $x_{ij}$), thereby enabling the $y_n \le T$ constraint to be enforced. However, obtaining 
the correct value does require that the value of each $y_k$ (including $y_n$) must be the 
smallest quantity that satisfies all the $y_i + x_{ij} \le y_j$ constraints. Now let us briefly 
describe why (under ordinary circumstances) this property holds for an optimal 
solution. 


Figure 10.31 Time-cost curve for the overall project. 

Consider any solution for the $x_{ij}$ variables such that every path through the 
network is a critical path requiring a time of T. If the values of the $y_k$ variables satisfy 
the above property, then the $y_k$ are true earliest times with $y_n = T$ exactly, and the 
overall solution for the $x_{ij}$ and $y_k$ satisfies all the constraints. However, if any particular 
$y_i$ is made a little larger, this would create a chain reaction whereby some $y_j$ would 
need to be made a little larger to still satisfy the $y_i + x_{ij} \le y_j$ constraints, etc., until 
ultimately $y_n$ must be made a little larger, thereby violating the $y_n \le T$ constraint. 
The only way to avoid violating this constraint with the larger $y_i$ is to make the duration 
times for some activities (subsequent to event i) a little smaller, thereby increasing 
the cost. Therefore, an optimal solution will avoid making any $y_k$ larger than need be 
to satisfy the $y_i + x_{ij} \le y_j$ constraints. 

The problem as stated here assumes that a specified deadline T has been fixed 
(perhaps by contract) for the completion of the project. In fact, some projects do not 
have such a deadline, in which case it is not clear what value should be assigned to 
T in the linear programming formulation. In such situations, the decision on T actually 
is a question of what is the best trade-off between the total cost and the total time for 
the project. 

The basic information we need to address this question is how the minimum 
total direct cost changes as T is changed in the preceding formulation, as illustrated 
in Fig. 10.31. This information can be obtained by using parametric linear program- 
ming (see Secs. 4.7, 6.7, and 9.3) to solve for the optimal solution as a function of 
T over its entire range. Even more efficient procedures that exploit the special 
structure of the problem also are available for obtaining this information.² 


¹ The slope of the time-cost curve changes at the points shown in Fig. 10.31 because the set of basic 
variables that give the optimal solution changes at these values of T. This fact is discussed further in a 
more general context in Sec. 9.3. 


² See Selected Reference 13 for further information. 

Figure 10.32 Minimum cost curves for the overall project. 


Figure 10.31 provides a useful basis for a managerial decision on T (and the 
corresponding optimal solution for the $x_{ij}$) when the important effects of the project 
duration (other than direct costs) are largely intangible. However, when these other 
effects are primarily financial (indirect costs), it is appropriate to combine the 
minimum total direct cost curve of Fig. 10.31 with a curve of minimum total indirect cost 
(supervision, facilities, clerical, interest, contractual penalties) versus T, as shown in 
Fig. 10.32. The sum of these curves thereby gives the minimum total project cost 
curve for the various values of T. The optimal value of T is then the one that minimizes 
this total cost curve. 
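
Once both curves have been tabulated, choosing T is a one-line minimization. The sketch below uses made-up cost figures at a handful of candidate values of T; in practice the direct costs would come from solving the preceding linear programming model for each T, and the indirect costs from the organization's own estimates.

```python
# Hypothetical minimum total direct cost for each candidate project duration T.
direct_cost = {40: 5200, 42: 4900, 44: 4700, 46: 4600, 48: 4550}

# Hypothetical minimum total indirect cost for the same values of T.
indirect_cost = {40: 2000, 42: 2150, 44: 2300, 46: 2500, 48: 2750}

# Total project cost curve: sum of the two curves at each T.
total_cost = {T: direct_cost[T] + indirect_cost[T] for T in direct_cost}

# The optimal T minimizes the total cost curve.
best_T = min(total_cost, key=total_cost.get)
```

Direct cost falls and indirect cost rises as T increases, so the minimum of the sum typically lies strictly between the shortest and longest durations considered.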


Choosing between PERT and CPM 


The choice between the PERT three-estimate approach and the CPM method of time- 
cost trade-offs depends primarily upon the type of project and the managerial objec- 
tives. PERT is particularly appropriate when there is considerable uncertainty in 
predicting activity times and when it is important to effectively control the project 
schedule; for example, most research and development projects fall into this category. 
On the other hand, CPM is particularly appropriate when activity times can be pre- 
dicted well (perhaps based on previous experience) but these times can be adjusted 
readily (e.g., by changing crew sizes), and when it is important to plan an appropriate 
trade-off between project time and cost. This latter type is typified by most construc- 
tion and maintenance projects. 

Actually, differences between current versions of PERT and CPM are not nec- 
essarily as pronounced as we have described them. Most versions of PERT now allow 
using only a single estimate (the most likely estimate) of each activity time and thus 
omit the probabilistic investigation. A version called PERT/Cost also considers time- 
cost trade-offs in a manner similar to CPM. 


10.9 Conclusions 





Networks of some type arise in a wide variety of contexts. Network representations 
are very useful for portraying the relationships and connections between the compo- 
nents of systems. Frequently, flow of some type must be sent through a network, so 
a decision needs to be made on the best way to do this. The kinds of network 
optimization models and algorithms introduced in this chapter provide a powerful tool 
for making such decisions. 

The minimum cost flow problem plays a central role among these network 
optimization models, both because it is so broadly applicable and because it can be 
solved extremely efficiently by the network simplex method. Two of its special cases 
included in this chapter, the shortest path problem and the maximum flow problem, 
also are basic network optimization models, as are three additional special cases 
discussed in Chap. 7 (the transportation problem, the transshipment problem, and the 
assignment problem). 

Whereas all of these models are concerned with optimizing the operation of an 
existing network, the minimum spanning tree problem is a prominent example of a 
model for optimizing the design of a new network. 

This chapter has only scratched the surface of the current state of the art of 
network methodology. Because of their combinatorial nature, network problems often 
are extremely difficult to solve. However, great progress is being made in developing 
powerful modeling techniques and solution methodologies that are opening up new 
vistas for important applications. In fact, recent algorithmic advances are enabling us 
to solve successfully some complex network problems of enormous size. 

The most widely used network technique has been the PERT-type system for 
project planning and control. It has been very valuable for organizing planning effort, 
testing alternative plans, revealing the overall dimensions and details of the project 
plan, establishing well-understood management responsibilities, and identifying real- 
istic expectations for the project. It also lays the foundation for anticipatory manage- 
ment action against potential trouble spots during the course of the project. Although 
not a panacea, it has greatly aided project management on numerous occasions. 
PROBLEMS 


1. Consider the following directed network. 











(a) Find a directed path from node A to node F, and then identify three other undirected 
paths from node A to node F. 

(b) Find three directed cycles. Then identify an undirected cycle that includes every 
node. 

(c) Identify a set of arcs that forms a spanning tree. 

(d) Use the process illustrated in Fig. 10.3 to grow a tree one arc at a time until a 
spanning tree has been formed. Then repeat this process to obtain another spanning 
tree. [Do not duplicate the spanning tree identified in part (c).] 


2. At a small but growing airport, the local airline company is purchasing a new tractor 
for a tractor-trailer train to bring luggage to and from the airplanes. A new mechanized luggage 
system will be installed in 3 years, so the tractor will not be needed after that. However, 
because it will receive heavy use, so that the running and maintenance costs will increase 
rapidly as it ages, it may still be more economical to replace the tractor after 1 or 2 years. The 
following table gives the total net discounted cost associated with purchasing a tractor (purchase 
price minus trade-in allowance, plus running and maintenance costs) at the end of year i and 
trading it in at the end of year j (where year 0 is now). 
              j 
  i      1     2     3 
  0      8    18    31 
  1           10    21 





The problem is to determine at what times (if any) the tractor should be replaced to minimize 
the total cost for the tractors over the 3 years. 

(a) Formulate this problem as a shortest path problem. 

(b) Use the algorithm described in Sec. 10.3 to solve this shortest path problem. 


3.* Use the algorithm described in Sec. 10.3 to find the shortest path through networks 
(a) and (b), where the numbers represent actual distances between the corresponding nodes. 


[Figure: networks (a) and (b) for Prob. 3, with the destination node marked.] 


4. Reconsider the student’s ‘‘car problem’’ described in Prob. 26 at the end of Chap. 7. 
(a) Formulate the student’s problem as a shortest path problem. 
(b) Use the algorithm described in Sec. 10.3 to solve this shortest path problem. 


5. Formulate the shortest path problem as a linear programming problem. 


6. A company has learned that a competitor is planning to come out with a new kind 
of product with a great sales potential. This company has been working on a similar product, 
and research is nearly complete. It now wishes to rush the product out to meet the competition. 
There are four nonoverlapping phases left to be accomplished, including the remaining research 
that currently is being conducted at a normal pace. However, each phase can instead be con- 
ducted at a priority or crash level to expedite completion. The times required (in months) at 
these levels are 






















[Table: time (in months) at each level for the four phases: Remaining Research, Development, Design of Manufacturing System, and Initiate Production and Distribution (entries missing).] 





$30,000,000 is available for these four phases. The cost (in millions of dollars) at the different 
levels is 










[Table: cost (in millions of dollars) at each level for the four phases: Remaining Research, Development, Design of Manufacturing System, and Initiate Production and Distribution (entries missing).] 





The problem is to determine at which level to conduct each of the four phases to minimize the 
total time until the product can be marketed subject to the budget restriction. 

(a) Formulate this problem as a shortest path problem. 

(b) Use the algorithm described in Sec. 10.3 to solve this shortest path problem. 


7.* Reconsider the networks shown in Prob. 3. Assume that the nodes and actual distances 
between nodes are as shown there (where unspecified distances between nodes are greater 
than any of the given distances), but assume that the arcs have not yet been specified. Use the 
algorithm described in Sec. 10.4 to find the minimum spanning tree for each of these networks. 


8. A logging company will soon begin logging eight groves of trees in the same general 
area. Therefore, it must develop a system of dirt roads that makes each grove accessible from 
every other grove. The distance (in miles) between every pair of groves is 


Distance between Pairs of Groves 

Grove     1     2     3     4     5     6     7     8 

  1       —    1.3   2.1   0.9   0.7   1.8   2.0   1.5 
  2      1.3    —    0.9   1.8   1.2   2.6   2.3   1.1 
  3      2.1   0.9    —    2.6   1.7   2.5   1.9   1.0 
  4      0.9   1.8   2.6    —    0.7   1.6   1.5   0.9 
  5      0.7   1.2   1.7   0.7    —    0.9   1.1   0.8 
  6      1.8   2.6   2.5   1.6   0.9    —    0.6   1.0 
  7      2.0   2.3   1.9   1.5   1.1   0.6    —    0.5 
  8      1.5   1.1   1.0   0.9   0.8   1.0   0.5    — 











The problem is to determine between which pairs of groves the roads should be constructed to 
connect all groves with a minimum total length of road. 
(a) Describe how this problem fits the network description of the minimum spanning 
tree problem. 
(b) Use the algorithm described in Sec. 10.4 to solve the problem. 


9. A bank soon will be hooking up computer terminals at each of its branch offices to 
the computer at its main office using special phone lines with telecommunications devices. The 
phone line from a branch office need not be connected directly to the main office. It can be 
connected indirectly by being connected to another branch office that is connected (directly or 
indirectly) to the main office. The only requirement is that every branch office be connected 
by some route to the main office. 

The charge for the special phone lines is directly proportional to the mileage involved, 
where the distance (in miles) between every pair of offices is 


Distance between Pairs of Offices 

               Main    B.1    B.2    B.3    B.4    B.5 

Main office      —     190     70    115    270    160 
Branch 1        190     —     100    240    215     50 
Branch 2         70    100     —     140    120    220 
Branch 3        115    240    140     —     175     80 
Branch 4        270    215    120    175     —     310 
Branch 5        160     50    220     80    310     — 





The problem is to determine which pairs of offices should be directly connected by special 
phone lines in order to connect every branch office (directly or indirectly) to the main office at 
a minimum total cost. 
(a) Describe how this problem fits the network description of the minimum spanning 
tree problem. 
(b) Use the algorithm described in Sec. 10.4 to solve the problem. 


10.* For networks (a) and (b), use the augmenting path algorithm described in Sec. 
10.5 to find the flow pattern giving the maximum flow from the supply node (the left-most node) 
to the demand node (the right-most node), given that the arc capacity from node i to node j is 
the number nearest node i along the link between these nodes. 


[Figure: networks (a) and (b) for Prob. 10, with the source node marked.] 


11. Formulate the maximum flow problem as a linear programming problem. 


12. One track of the Eura Railroad system runs from the major industrial city of Faire- 
parc to the major port city of Portstown. This track is heavily used by both express passenger 
and freight trains. The passenger trains are carefully scheduled and have priority over the slow 
freight trains (this is a European railroad), so that the freight trains must pull over onto a siding 
whenever a passenger train is scheduled to pass them soon. It is now necessary to increase the 
freight service, so the problem is to schedule the freight trains so as to maximize the number 
that can be sent each day without interfering with the fixed schedule for passenger trains. 

Consecutive freight trains must maintain a schedule differential of at least 0.1 hour, and 
this is the time unit used for scheduling them (so that the daily schedule indicates the status of 
each freight train at times 0.0, 0.1, 0.2, ..., 23.9). There are S sidings between Faireparc 
and Portstown, where siding i is long enough to hold $n_i$ freight trains (i = 1, ..., S). It 
requires $t_i$ time units (rounded up to an integer) for a freight train to travel from siding i to 
siding i + 1 (where $t_0$ is the time from the Faireparc station to siding 1 and $t_S$ is the time from 
siding S to the Portstown station). A freight train is allowed to pass or leave siding i (i = 
0, 1, ..., S) at time j (j = 0.0, 0.1, ..., 23.9) only if it would not be overtaken by a 
scheduled passenger train before reaching siding i + 1 (let $\delta_{ij} = 1$ if it would not be overtaken, 
and let $\delta_{ij} = 0$ if it would be). A freight train also is required to stop at a siding if there will 
not be room for it at all subsequent sidings that it would reach before being overtaken by a 
passenger train. 

Formulate this problem as a maximum flow problem by identifying every node (including 
the supply node and the demand node) as well as every arc and its arc capacity for the network 
representation of the problem. (Hint: Use a different set of nodes for each of the 240 times.) 


13. Consider the maximum flow problem shown below, where the supply node is node 
A, the demand node is node F, and the arc capacities are the numbers shown next to these 
directed arcs. 





(a) Use the augmenting path algorithm described in Sec. 10.5 to solve this problem. 

(b) Formulate the network representation of this problem as a minimum cost flow problem, 
including adding the arc A → F. Use $\bar{F} = 20$. 

(c) Obtain an initial basic feasible solution by solving the feasible spanning tree with 
basic arcs A → B, A → C, A → F, B → D, and E → F, where two of the nonbasic 
arcs (E → C and F → D) are reverse arcs. 

(d) Use the network simplex method to solve this problem. 


14. A company will be producing the same new product at two different factories, and 
then the product must be shipped to two warehouses. Factory 1 can send an unlimited amount 
by rail to Warehouse 1 only, whereas Factory 2 can send an unlimited amount by rail to 
Warehouse 2 only. However, independent truckers can be used to ship up to 50 units from 
each factory to a distribution center, from which up to 50 units can be shipped to each ware- 
house. The shipping cost per unit for each alternative is shown in the following table, along 
with the amounts to be produced at the factories and the amounts needed at the warehouses. 
[Table: unit shipping cost from Factory 1, Factory 2, and the distribution center to the distribution center and Warehouses 1 and 2, together with the output of each factory and the allocation to each warehouse (entries missing).] 

(a) Formulate the network representation of this problem as a minimum cost flow problem. 

(b) Formulate the linear programming model for this problem. 

(c) Obtain an initial basic feasible solution by solving the feasible spanning tree that 
corresponds to using just the two rail lines plus Factory 1 shipping to Warehouse 2 
via the distribution center. 

(d) Use the network simplex method to solve this problem. 


15. Reconsider Prob. 2. 

(a) Now formulate this problem as a minimum cost flow problem by showing the 
appropriate network representation. 

(b) Starting with the initial basic feasible solution that corresponds to replacing the 
tractor every year, use the network simplex method to solve this problem. 


16. For the P & T Co. transportation problem given in Table 7.2, consider its network 
representation as a minimum cost flow problem presented in Fig. 10.10. Use the northwest 
corner rule to obtain an initial basic feasible solution from Table 7.2. Then use the network 
simplex method to solve this problem (and verify the optimal solution given in Sec. 7.1). 


17. Consider the Metro Water District transportation problem presented in Table 7.12. 

(a) Formulate the network representation of this problem as a minimum cost flow problem. 
(Hint: Arcs where flow is prohibited should be deleted.) 

(b) Starting with the initial basic feasible solution given in Table 7.19, use the network 
simplex method to solve this problem. Compare the sequence of basic feasible 
solutions obtained with the sequence obtained by the transportation simplex method 
in Table 7.23. 


18. Consider the transportation problem having the following cost and requirements 
table: 

[Table: costs and requirements by source and destination (entries missing).] 

Formulate the network representation of this problem as a minimum cost flow problem. Use 
the northwest corner rule to obtain an initial basic feasible solution. Then use the network 
simplex method to solve the problem. 


19. Consider the minimum cost flow problem shown below, where the $b_i$ are given by 

[Figure: network for Prob. 19; the surviving labels are the node values [0], [80], and [−60] and one arc capacity of 40.] 


Obtain an initial basic feasible solution by solving the feasible spanning tree with basic arcs 
A → C, B → A, C → D, and C → E, where one of the nonbasic arcs (D → A) is a reverse 
arc. Then use the network simplex method to solve this problem. 


20.* Consider the following project network. Assume that the time required (in weeks) 
for each activity is a predictable constant and that it is given by the number along the corre- 
sponding arc. Find the earliest time, latest time, and slack for each event, as well as the slack 
for each activity. Also identify the critical path. 





21. Consider the following project network. Assume that the time required (in days) for 
each activity is a predictable constant and that it is given by the number along the corresponding 
arc. Find the earliest time, latest time, and slack for each event, as well as the slack for each 
activity. Also identify the critical path. 
22. You and several friends are about to prepare a lasagne dinner. The tasks to be 
performed, their times (in minutes), and the precedence constraints are as follows: 

Task No.   Task                             Time   Tasks That Must Precede 

    1      Buy the mozzarella cheese*        30 
    2      Slice the mozzarella               5    1 
    3      Beat 2 eggs                        2 
    4      Mix eggs and ricotta cheese        3    3 
    5      Cut up onions and mushrooms        7 
    6      Cook the tomato sauce             25    5 
    7      Boil large quantity of water      15 
    8      Boil the lasagne noodles          10    7 
    9      Drain the lasagne noodles          2    8 
   10      Assemble all the ingredients      10    9, 6, 4, 2 
   11      Preheat the oven                  15 
   12      Bake the lasagne 

* There is none in the refrigerator. 


(a) Formulate this problem as a PERT-type system by drawing the project network. Use 
one event to represent the simultaneous initiation of the initial tasks. On one side 
of each arc, identify the number of the task being performed in parentheses, e.g., 
(Task 7). On the other side, show the times required. 

(b) Find the earliest time, latest time, and slack for each event, as well as the slack for 
each activity. Also identify the critical path. 

(c) Because of a phone call you were interrupted for 6 minutes when you should have 
been cutting the onions and mushrooms. By how much will the dinner be delayed? 
If you use your food processor, which reduces the cutting time from 7 minutes to 
2 minutes, will the dinner still be delayed? 


23.* Using the PERT three-estimate approach, the three estimates for one of the activities 
are as follows: optimistic estimate = 30 days, most likely estimate = 36 days, pessimistic 
estimate = 48 days. What are the resulting estimates of the expected value and variance of 
the time required by the activity? 


24. Consider the following project network. 
The PERT three-estimate approach has been used, and it has led to the following estimates of 
the expected value (in months) and variance of the time required for the respective activities: 





                  Activity Times 

             Estimated         Estimated 
Activity     Expected Value    Variance 

 1 → 2            4                5 
 1 → 3            6               10 
 2 → 4            4                8 
 2 → 5            8               12 
 3 → 4            3                6 
 3 → 6            7               14 
 4 → 5            5               12 
 4 → 6            3                5 
 5 → 7            5                8 
 6 → 7            5                7 








The scheduled project completion time is 22 months after the start of the project. 

(a) Using expected values, determine the critical path for the project. 

(b) Using the procedure described in Sec. 10.8, find the approximate probability that 
the project will be completed by the scheduled time. 

(c) In addition to the critical path, there are five other paths through the network. For 
each of these other paths, find the approximate probability that the sum of the activity 
times along the path is not more than 22 months. 


25. Consider the following project network. 
Using the PERT three-estimate approach, suppose that the usual three estimates for the time 
required (in weeks) for each of these activities are 














             Optimistic    Most Likely    Pessimistic 
Activity     Estimate      Estimate       Estimate 

 1 → 2          28            32             36 
 1 → 3          22            28             32 
 2 → 6          26            36             46 
 3 → 4          14            16             18 
 3 → 5          32            32             32 
 3 → 6          40            52             74 
 4 → 5          12            16             24 
 5 → 6          16            20             26 
 5 → 7          26            34             42 
 6 → 7          12            16             30 


The project is ready to start now and the deadline for completing the project is 100 weeks 
hence. 
(a) On the basis of the estimates just listed, calculate the expected value and standard 
deviation of the time required for each activity. 
(b) Using expected times, determine the critical path for the project. 
(c) Using the procedure described in Sec. 10.8, find the approximate probability that 
the project will be completed by the deadline. 


26. Reconsider the project network shown in Prob. 25. Suppose that the CPM method 
of time-cost trade-offs is to be used to determine how to meet the project deadline (100 weeks 
hence) in the most economical way. Also suppose that the crash time and normal time for each 
of the activities correspond to the times shown in the Optimistic Estimate and Pessimistic 
Estimate columns of the table for Prob. 25 (except that activity 3 → 5 has a crash time of 28 
and a normal time of 36) and that the difference between the crash cost and normal cost is 10 
(in units of thousands of dollars) for every activity. Formulate the linear programming model 
for this problem. 


27. Suppose that the scheduled completion time for the house-construction project de- 
scribed in Figs. 10.27 and 10.28 has been moved forward to 40. Therefore, the CPM method 
of time-cost trade-offs is to be used to determine how to accelerate the project to meet this 
deadline in the most economical way. The relevant data are: 







             Normal    Crash 
Activity     Time      Time 

 1 → 2          2        1 
 2 → 3          4        2 
 3 → 4         10        7 
 4 → 5          4        3 
 4 → 6          6        4 
 4 → 7          7        5 
 5 → 7          5        3 
 6 → 8          7        4 
 7 → 9          8        6 
                9        6 
                4        3 
                5        3 
                2        1 
                6        3 











Formulate the linear programming model for this problem. 
Dynamic Programming 


Dynamic programming is a useful mathematical technique for making a sequence of 
interrelated decisions. It provides a systematic procedure for determining the optimal 
combination of decisions. 

In contrast to linear programming, there does not exist a standard mathematical 
formulation of ‘‘the’’ dynamic programming problem. Rather, dynamic programming 
is a general type of approach to problem solving, and the particular equations used 
must be developed to fit each individual situation. Therefore, a certain degree of 
ingenuity and insight into the general structure of dynamic programming problems is 
required to recognize when a problem can be solved by dynamic programming pro- 
cedures and how it can be done. These abilities can best be developed by an exposure 
to a wide variety of dynamic programming applications and a study of the character- 
istics that are common to all these situations. A large number of illustrative examples 
are presented for this purpose. 






Figure 11.1 The road system and costs for the stagecoach problem. 


11.1 Prototype Example 


The STAGECOACH PROBLEM is a problem especially constructed¹ to illustrate the 
features and to introduce the terminology of dynamic programming. It concerns a 
mythical fortune seeker in Missouri who decided to go west to join the 49’er gold 
rush in California during the mid-nineteenth century. The journey would require trav- 
eling by stagecoach through unsettled country where there was serious danger of attack 
by marauders. Although his starting point and destination were fixed, he had consid- 
erable choice as to which states (or territories that subsequently became states) to 
travel through en route. The possible routes are shown in Fig. 11.1, where each state 
is represented by a lettered circle. Thus four stages (stagecoach runs) were required 
to travel from his point of embarkation in state A (Missouri) to his destination in state 
J (California). 

This fortune seeker was a prudent man who was quite concerned about his safety. 
After some thought, he came up with a rather clever way of determining the safest 
route. Life insurance policies were offered to stagecoach passengers. Because the cost 
of the policy for taking any given stagecoach run was based on a careful evaluation 
of the safety of that run, the safest route should be the one with the cheapest total 
life insurance policy. 

The cost for the standard policy on the stagecoach run from state i to state j, 
which will be denoted by $c_{ij}$, is 

         B    C    D 
    A    2    4    3 

         E    F    G 
    B    7    4    6 
    C    3    2    4 
    D    4    1    5 

         H    I 
    E    1    4 
    F    6    3 
    G    3    3 

         J 
    H    3 
    I    4 


These costs are also shown in Fig. 11.1. 


1 This problem was developed by Professor Harvey M. Wagner while he was at Stanford University. 


We shall now focus on the question of which route minimizes the total cost of 
the policy. 


Solving the Problem 


First note that the shortsighted approach of selecting the cheapest run offered by each 
successive stage need not yield an overall optimal decision. Following this strategy 
would give the route A → B → F → I → J, at a total cost of 13. However, sacrificing 
a little on one stage may permit greater savings thereafter. For example, A → D → 
F is cheaper overall than A → B → F. 

One possible approach to solving this problem is to use trial and error.¹ However, 
the number of possible routes is large (18) and having to calculate the total cost 
for each route is not an appealing task. 
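
To make the comparison concrete, here is a minimal Python sketch (not part of the original text) that enumerates every route by brute force and also follows the shortsighted cheapest-next-run strategy. The arc costs are those shown in Fig. 11.1; the names `COSTS`, `all_routes`, `route_cost`, and `greedy_route` are illustrative assumptions.

```python
# Arc costs c[i][j] for the stagecoach network (as shown in Fig. 11.1).
COSTS = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
}


def all_routes(start="A", end="J"):
    """Yield every route from start to end as a list of states."""
    if start == end:
        yield [end]
        return
    for nxt in COSTS[start]:
        for tail in all_routes(nxt, end):
            yield [start] + tail


def route_cost(route):
    """Sum the arc costs along a route."""
    return sum(COSTS[i][j] for i, j in zip(route, route[1:]))


def greedy_route(start="A", end="J"):
    """Always take the cheapest run offered from the current state."""
    route = [start]
    while route[-1] != end:
        options = COSTS[route[-1]]
        route.append(min(options, key=options.get))
    return route


routes = list(all_routes())
best_cost = min(route_cost(r) for r in routes)
greedy = greedy_route()

print(len(routes))                  # number of routes enumerated
print(greedy, route_cost(greedy))   # shortsighted route and its cost
print(best_cost)                    # cheapest cost over all routes
```

Enumeration stays manageable here only because the network is tiny; the number of routes grows multiplicatively with the number of stages and of states per stage.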

Fortunately, dynamic programming provides a solution with much less effort 
than exhaustive enumeration. (The computational savings are enormous for larger 
versions of this problem.) Dynamic programming starts with a small portion of the 
original problem and finds the optimal solution for this smaller problem. It then 
gradually enlarges the problem, finding the current optimal solution from the preceding 
one, until the original problem is solved in its entirety. 

For the stagecoach problem, we start with the smaller problem where the fortune 
seeker has nearly completed his journey and has only one more stage (stagecoach run) 
to go. The obvious optimal solution for this smaller problem is to go from his current 
state (whatever it is) to his ultimate destination (state J). At each subsequent iteration, 
the problem is enlarged by increasing by one the number of stages left to go to 
complete the journey. For this enlarged problem, the optimal solution for where to 
go next from each possible state can be found relatively easily from the results obtained 
at the preceding iteration. The details involved in implementing this approach follow. 


FORMULATION: Let the decision variables $x_n$ ($n = 1, 2, 3, 4$) be the immediate 
destination on stage n (the nth stagecoach run to be taken). Thus the route selected 
is $A \to x_1 \to x_2 \to x_3 \to x_4$, where $x_4 = J$. 

Let $f_n(s, x_n)$ be the total cost of the best overall policy for the remaining stages, 
given that the fortune seeker is in state s, ready to start stage n, and selects $x_n$ as the 
immediate destination. Given s and n, let $x_n^*$ denote the value of $x_n$ that minimizes 
$f_n(s, x_n)$, and let $f_n^*(s)$ be the corresponding minimum value of $f_n(s, x_n)$. Thus 

$$f_n^*(s) = \min_{x_n} f_n(s, x_n) = f_n(s, x_n^*),$$ 

where 

$$f_n(s, x_n) = \text{immediate cost (stage } n\text{)} + \text{minimum future cost (stages } n + 1 \text{ onward)} = c_{s x_n} + f_{n+1}^*(x_n).$$ 


1 This problem also can be formulated as a shortest path problem (see Sec. 10.3), where costs here play 
the role of distances in the shortest path problem. The solution procedure presented in Sec. 10.3 actually 
uses the philosophy of dynamic programming. However, because the present problem has a fixed number 
of stages, the dynamic programming approach presented here is even better. 
The value of $c_{s x_n}$ is given by the preceding tables for $c_{ij}$ by setting i = s (the current 
state) and j = $x_n$ (the immediate destination). Because the ultimate destination (state 
J) is reached at the end of stage 4, $f_5^*(J) = 0$. 

The objective is to find $f_1^*(A)$ and the corresponding route. Dynamic programming 
finds it by successively finding $f_4^*(s)$, $f_3^*(s)$, $f_2^*(s)$ for each of the possible states 
s and then using $f_2^*(s)$ to solve for $f_1^*(A)$.¹ 
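
The backward recursion just described can be sketched in a few lines of Python. This is an illustrative sketch rather than the book's own procedure: the arc costs are those of Fig. 11.1, while the names `COSTS`, `STAGES`, and `solve_backward` are assumptions introduced here. Because each state belongs to exactly one stage in this problem, the sketch indexes $f_n^*(s)$ by state alone.

```python
# Arc costs c[s][x] for the stagecoach network (as shown in Fig. 11.1).
COSTS = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
}

# Possible states s at the start of stages n = 1, 2, 3, 4.
STAGES = [["A"], ["B", "C", "D"], ["E", "F", "G"], ["H", "I"]]


def solve_backward(costs, stages, destination="J"):
    """Return (f_star, x_star) computed from the last stage back to the first."""
    f_star = {destination: 0}  # boundary condition f_5*(J) = 0
    x_star = {}
    for states in reversed(stages):  # n = 4, 3, 2, 1
        for s in states:
            # f_n(s, x_n) = c_{s x_n} + f_{n+1}*(x_n) for each destination x_n
            totals = {x: c + f_star[x] for x, c in costs[s].items()}
            best = min(totals.values())
            f_star[s] = best
            # keep every minimizing decision, since ties are possible
            x_star[s] = [x for x, total in totals.items() if total == best]
    return f_star, x_star


f_star, x_star = solve_backward(COSTS, STAGES)
print(f_star["A"], x_star["A"])
```

Each state is evaluated once and each arc is examined once, which is where the saving over exhaustive enumeration comes from.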


SOLUTION PROCEDURE: When the fortune seeker has only one more stage to go 
(n = 4), his route thereafter is determined entirely by his current state s (either H or 
I) and his final destination, $x_4 = J$, so the route for this final stagecoach run is s → J. 
Therefore, since $f_4^*(s) = f_4(s, J) = c_{sJ}$, the immediate solution to the n = 4 
problem is 

n = 4:   s     $f_4^*(s)$    $x_4^*$ 
         H         3           J 
         I         4           J 





When the fortune seeker has two more stages to go (n = 3), the solution 
procedure requires a few calculations. For example, suppose that the fortune seeker 
is in state F. Then, as depicted below, he must next go to either state H or I at an 
immediate cost of $c_{FH} = 6$ or $c_{FI} = 3$, respectively. If he chooses state H, the 
minimum additional cost after he reaches there is given in the preceding table as 
$f_4^*(H) = 3$, as shown next to the H box in the diagram. Therefore, the total cost for 
this decision would be 6 + 3 = 9. If he chooses state I instead, the total cost is 
3 + 4 = 7, which is smaller. Therefore, the optimal choice would be this latter 
one, $x_3^* = I$, because it gives the minimum cost, $f_3^*(F) = 7$. 

[Diagram: state F with arcs to H (cost 6, $f_4^*(H) = 3$) and to I (cost 3, $f_4^*(I) = 4$).] 


Similar calculations need to be made when starting from the other two possible 
states, s = E and s = G, with two stages to go. Try it, proceeding both graphically 


¹ Because this procedure involves moving backward stage by stage, some writers also count n backward 
to denote the number of remaining stages to the destination. We use the more natural forward counting 
for greater simplicity. 


(Fig. 11.1) and algebraically [combining $c_{ij}$ and $f_4^*(s)$ values], to verify the following 
complete results for the n = 3 problem. 

           $f_3(s, x_3) = c_{s x_3} + f_4^*(x_3)$ 
n = 3:   s      H      I      $f_3^*(s)$    $x_3^*$ 
         E      4      8          4           H 
         F      9      7          7           I 
         G      6      7          6           H 


The solution for the three-stage problem (n = 2) is obtained in a similar fashion. 
In this case, $f_2(s, x_2) = c_{s x_2} + f_3^*(x_2)$. For example, suppose that the fortune seeker 
is in state C, as depicted below. 

[Diagram: state C with arcs to E (cost 3), F (cost 2), and G (cost 4), labeled with $f_3^*(E) = 4$, $f_3^*(F) = 7$, and $f_3^*(G) = 6$.] 

He must next go to state E, F, or G at an immediate cost of $c_{CE} = 3$, $c_{CF} = 2$, or 
$c_{CG} = 4$, respectively. After getting there, the minimum additional cost for stage 3 
to the end is given by the n = 3 table as $f_3^*(E) = 4$, $f_3^*(F) = 7$, or $f_3^*(G) = 6$, 


respectively, as shown next to the E, F, and G states in the above diagram. The 
resulting calculations for the three alternatives are summarized below. 


$x_2 = E$:   $f_2(C, E) = c_{CE} + f_3^*(E) = 3 + 4 = 7$. 
$x_2 = F$:   $f_2(C, F) = c_{CF} + f_3^*(F) = 2 + 7 = 9$. 
$x_2 = G$:   $f_2(C, G) = c_{CG} + f_3^*(G) = 4 + 6 = 10$. 


The minimum of these three numbers is 7, so the minimum total cost from state C to 
the end is $f_2^*(C) = 7$ and the immediate destination should be $x_2^* = E$. 

Making similar calculations when starting from state B or D (try it) yields the 
following results for the n = 2 problem: 


           $f_2(s, x_2) = c_{s x_2} + f_3^*(x_2)$ 
n = 2:   s      E      F      G      $f_2^*(s)$    $x_2^*$ 
         B     11     11     12         11         E or F 
         C      7      9     10          7         E 
         D      8      8     11          8         E or F 
Moving to the four-stage problem (n = 1), the calculations are similar to those 
just shown for the three-stage problem (n = 2), except now there is just one possible 
starting state, s = A, as depicted below. 

[Diagram: state A with arcs to B, C, and D, labeled with $f_2^*(B) = 11$, $f_2^*(C) = 7$, and $f_2^*(D) = 8$.] 

These calculations are summarized next for the three alternatives for the immediate 
destination: 

$x_1 = B$:   $f_1(A, B) = c_{AB} + f_2^*(B) = 2 + 11 = 13$. 
$x_1 = C$:   $f_1(A, C) = c_{AC} + f_2^*(C) = 4 + 7 = 11$. 
$x_1 = D$:   $f_1(A, D) = c_{AD} + f_2^*(D) = 3 + 8 = 11$. 


Since 11 is the minimum, $f_1^*(A) = 11$ and $x_1^* = C$ or D, as shown in the following 
table. 

           $f_1(s, x_1) = c_{s x_1} + f_2^*(x_1)$ 
n = 1:   s      B      C      D      $f_1^*(s)$    $x_1^*$ 
         A     13     11     11         11         C or D 




The optimal solution for the entire problem can now be identified. Results for 
the n = 1 problem indicate that the fortune seeker should go initially to either state 
C or state D. Suppose that he chooses $x_1^* = C$. For n = 2, the result for s = C is 
$x_2^* = E$. This result leads to the n = 3 problem, which gives $x_3^* = H$ for s = E, 
and the n = 4 problem yields $x_4^* = J$ for s = H. Hence one optimal route is 
A → C → E → H → J. Choosing $x_1^* = D$ leads to the other two optimal routes, 
A → D → E → H → J and A → D → F → I → J. They all yield a total cost 
of $f_1^*(A) = 11$. 
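
The route-tracing step above can also be sketched in Python. The sketch assumes the optimal decisions have already been tabulated stage by stage; the dictionary `X_STAR` simply transcribes those $x_n^*$ values, and the function name `optimal_routes` is an illustrative assumption rather than anything defined in the text.

```python
# Optimal immediate destinations x_n* for each state, transcribed from the
# stage-by-stage results (ties are listed in full).
X_STAR = {
    "A": ["C", "D"],
    "B": ["E", "F"],
    "C": ["E"],
    "D": ["E", "F"],
    "E": ["H"],
    "F": ["I"],
    "G": ["H"],
    "H": ["J"],
    "I": ["J"],
}


def optimal_routes(state, policy, destination="J"):
    """Return every optimal route from `state` to the destination."""
    if state == destination:
        return [[destination]]
    routes = []
    for nxt in policy[state]:  # branch on each tied optimal decision
        for tail in optimal_routes(nxt, policy, destination):
            routes.append([state] + tail)
    return routes


for route in optimal_routes("A", X_STAR):
    print(" -> ".join(route))
```

Because the policy covers every state, the same function can be started from any state, not only A; for instance, `optimal_routes("B", X_STAR)` gives the best way onward from state B.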

You will see in the next section that the special terms describing the particular 
context of this problem—stage, state, policy—actually are part of the general ter- 
minology of dynamic programming with an analogous interpretation in other contexts. 


11.2 Characteristics of Dynamic Programming Problems 


The stagecoach problem is a literal prototype of dynamic programming problems. In 
fact, this example was purposely designed to provide a literal physical interpretation 
of the rather abstract structure of such problems. Therefore, one way to recognize a 


situation that can be formulated as a dynamic programming problem is to notice that 
its basic structure is analogous to that of the stagecoach problem. 

These basic features that characterize dynamic programming problems are pre- 
sented and discussed here. 


1. The problem can be divided into stages, with a policy decision required at 
each stage. 

The stagecoach problem was literally divided into its four stages (stagecoaches) 
that correspond to the four legs of the journey. The policy decision at each stage was 
which life insurance policy to choose (i.e., which destination to select for the next 
stagecoach ride). Similarly, other dynamic programming problems require making a 
sequence of interrelated decisions, where each decision corresponds to one stage of 
the problem. 


2. Each stage has a number of states associated with it. 

The states associated with each stage in the stagecoach problem were the states 
(or territories) in which the fortune seeker could be located when embarking on that 
particular leg of the journey. In general, the states are the various possible conditions 
in which the system might be at that stage of the problem. The number of states may 
be either finite (as in the stagecoach problem) or infinite (as in some subsequent 
examples). 


3. The effect of the policy decision at each stage is to transform the current 
state into a state associated with the next stage (possibly according to a probability 
distribution). 

The fortune seeker’s decision as to his next destination led him from his current 
state to the next state on his journey. This procedure suggests that dynamic program- 
ming problems can be interpreted in terms of the networks described in Chap. 10. 
Each node would correspond to a state. The network would consist of columns of 
nodes, with each column corresponding to a stage, so that the flow from a node can 
go only to a node in the next column to the right. The links from a node to nodes in 
the next column correspond to the possible policy decisions on which state to go to 
next. The value assigned to each link usually can be interpreted as the immediate 
contribution to the objective function from making that policy decision. In most cases, 
the objective corresponds to finding either the shortest or the longest route through 
the network. 


4. The solution procedure is designed to find an optimal policy for the overall 
problem, i.e., a prescription of the optimal policy decision at each stage for each of 
the possible states. 

For the stagecoach problem, the solution procedure constructed a table for each
stage ($n$) that prescribes the optimal decision ($x_n^*$) for each possible state ($s$). Thus,
in addition to identifying three optimal solutions (optimal routes) for the overall prob- 
lem, the results also show the fortune seeker how he should proceed if he gets detoured 
to a state that is not on an optimal route. For any problem, dynamic programming 
provides this kind of policy prescription of what to do under every possible circum- 
stance (which is why the actual decision made upon reaching a particular state at a 
given stage is referred to as a policy decision). Providing this additional information 
beyond simply specifying an optimal solution (optimal sequence of decisions) can be 
helpful in a variety of ways, including sensitivity analysis.


5. Given the current state, an optimal policy for the remaining stages is independent
of the policy adopted in previous stages. (This is the principle of optimality
for dynamic programming.)

Given the state in which the fortune seeker is currently located, the optimal life
insurance policy (and its associated route) from this point onward is independent of
how he got there. For dynamic programming problems in general, knowledge of the
current state of the system conveys all the information about its previous behavior
necessary for determining the optimal policy henceforth. (This property is the Markovian
property discussed in Sec. 15.3.) Any problem lacking this property cannot
be formulated as a dynamic programming problem.


6. The solution procedure begins by finding the optimal policy for the last stage. 

The optimal policy for the last stage prescribes the optimal policy decision for 
each of the possible states at that stage. The solution of this one-stage problem is 
usually trivial, as it was for the stagecoach problem. 


7. A recursive relationship that identifies the optimal policy for stage $n$, given
the optimal policy for stage ($n + 1$), is available.

For the stagecoach problem, this recursive relationship was

\[ f_n^*(s) = \min_{x_n} \{ c_{s x_n} + f_{n+1}^*(x_n) \}. \]

Therefore, finding the optimal policy decision when starting in state $s$ at stage $n$
requires finding the minimizing value of $x_n$. The corresponding minimum cost is
achieved by using this value of $x_n$ and then following the optimal policy when starting
in state $x_n$ at stage ($n + 1$).

The precise form of the recursive relationship differs somewhat among dynamic 
programming problems. However, notation analogous to that introduced in the pre- 
ceding section will continue to be used here, as summarized below. 


$N$ = number of stages.

$n$ = label for current stage ($n = 1, 2, \ldots, N$).

$s_n$ = current state for stage $n$.

$x_n$ = decision variable for stage $n$.

$x_n^*$ = optimal value of $x_n$ (given $s_n$).

$f_n(s_n, x_n)$ = contribution of stages $n, n + 1, \ldots, N$ to the objective function
if the system starts in state $s_n$ at stage $n$, the immediate decision is
$x_n$, and optimal decisions are made thereafter.

$f_n^*(s_n) = f_n(s_n, x_n^*)$.

The recursive relationship will always be of the form

\[ f_n^*(s_n) = \max_{x_n} \{ f_n(s_n, x_n) \} \quad \text{or} \quad f_n^*(s_n) = \min_{x_n} \{ f_n(s_n, x_n) \}, \]

where $f_n(s_n, x_n)$ would be written in terms of $s_n$, $x_n$, $f_{n+1}^*(s_{n+1})$, and probably some
measure of the first-stage effectiveness (or ineffectiveness) of $x_n$.

The recursive relationship is given its name because it keeps recurring as we
move backward stage by stage. When the current stage number $n$ is decreased by one,
the new $f_n^*(s_n)$ function is derived by using the $f_{n+1}^*(s_{n+1})$ function that was just
derived during the preceding iteration, and then this process keeps repeating. This
property is emphasized in our next (and final) characteristic of dynamic programming.


8. When we use this recursive relationship, the solution procedure moves back- 
ward stage by stage—each time finding the optimal policy for that stage—until it 
finds the optimal policy starting at the initial stage. 

This backward movement was demonstrated by the stagecoach problem, where
the optimal policy was found successively beginning in each state at stages 4, 3, 2,
and 1, respectively.¹ For all dynamic programming problems, a table such as the
following one would be obtained for each stage ($n = N, N - 1, \ldots, 1$).

    $s_n$  |  $f_n^*(s_n)$  |  $x_n^*$

When this table is finally obtained for the initial stage ($n = 1$), the problem of interest
is solved. Because the initial state is known, the initial decision is specified by $x_1^*$ in
this table. The optimal value of the other decision variables is then specified by the
other tables in turn according to the state of the system that results from the preceding
decisions.


11.3 Deterministic Dynamic Programming


This section further elaborates upon the dynamic programming approach to determin- 
istic problems, where the state at the next stage is completely determined by the state 
and policy decision at the current stage. The probabilistic case, where there is a 
probability distribution for what the next state will be, is discussed in the next section. 

Deterministic dynamic programming can be described diagrammatically as
shown in Fig. 11.2. Thus at stage $n$ the process will be in some state $s_n$. Making
policy decision $x_n$ then moves the process to some state $s_{n+1}$ at stage ($n + 1$). The
contribution thereafter to the objective function under an optimal policy has been
previously calculated to be $f_{n+1}^*(s_{n+1})$. The policy decision $x_n$ also makes some
contribution to the objective function. Combining these two quantities in an appropriate
way provides $f_n(s_n, x_n)$, the contribution of stages $n$ onward to the objective
function. Optimizing with respect to $x_n$ then gives $f_n^*(s_n) = f_n(s_n, x_n^*)$. After finding
$x_n^*$ and $f_n^*(s_n)$ for each possible value of $s_n$, the solution procedure is ready to move
back one stage.
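
The backward pass just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `solve_backward`, its arguments, and the small three-stage network (nodes A through T with made-up link costs) are hypothetical and are not the stagecoach data. The default `combine` adds the immediate contribution to the value of the later stages and the default `best` minimizes; other objectives would pass different functions.

```python
def solve_backward(num_stages, states, decisions, transition, contribution,
                   combine=lambda now, later: now + later, best=min, terminal=0):
    """Backward recursion: returns (f_star, x_star), each keyed by stage n, then state s."""
    f_star, x_star = {}, {}
    for n in range(num_stages, 0, -1):              # n = N, N-1, ..., 1
        f_star[n], x_star[n] = {}, {}
        for s in states[n]:
            options = []
            for x in decisions(n, s):
                # Value of the remaining stages under an optimal policy.
                later = terminal if n == num_stages else f_star[n + 1][transition(n, s, x)]
                options.append((combine(contribution(n, s, x), later), x))
            f_star[n][s], x_star[n][s] = best(options, key=lambda pair: pair[0])
    return f_star, x_star


# Hypothetical shortest-route data (assumed for illustration only).
cost = {("A", "B"): 2, ("A", "C"): 4, ("B", "D"): 7, ("B", "E"): 4,
        ("C", "D"): 3, ("C", "E"): 2, ("D", "T"): 1, ("E", "T"): 5}
states = {1: ["A"], 2: ["B", "C"], 3: ["D", "E"]}

f_star, x_star = solve_backward(
    num_stages=3,
    states=states,
    decisions=lambda n, s: [j for (i, j) in cost if i == s],  # x_n = next node
    transition=lambda n, s, x: x,                             # next state is the chosen node
    contribution=lambda n, s, x: cost[(s, x)],                # immediate link cost
)

# Read off one optimal route by following x_star forward from the initial state.
route, s = ["A"], "A"
for n in range(1, 4):
    s = x_star[n][s]
    route.append(s)
print(f_star[1]["A"], route)
```

Note that `x_star` holds a decision for every state at every stage, not just for the states on the optimal route, which is the policy prescription emphasized in characteristic 4 of Sec. 11.2.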


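[Figure 11.2: state $s_n$ at stage $n$, with value $f_n(s_n, x_n)$, is linked by decision $x_n$ (labeled with the contribution of $x_n$) to state $s_{n+1}$ at stage $n + 1$, with value $f_{n+1}^*(s_{n+1})$.]

Figure 11.2 The basic structure for deterministic dynamic programming.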


¹ Actually, for this problem the solution procedure can move either backward or forward. However, for
many problems (especially when the stages correspond to time periods), the solution procedure must move
backward.

One way of categorizing deterministic dynamic programming problems is by
the form of the objective function. For example, the objective might be to minimize
the sum of the contributions from the individual stages (as for the stagecoach problem),
or to maximize such a sum, or to minimize a product of such terms, and so on.
Another categorization is in terms of the nature of the set of states for the respective
stages. In particular, the states $s_n$ might be representable by a discrete state variable
(as for the stagecoach problem), or by a continuous state variable, or perhaps a state
vector (more than one variable) is required.

Several examples are presented to illustrate these various possibilities. More 
important, they illustrate that these apparently major differences are actually quite 
inconsequential (except in terms of computational difficulty) because the underlying 
basic structure shown in Fig. 11.2 always remains the same. 

The first new example arises in a much different context from the stagecoach 
problem, but it has the same mathematical formulation except that the objective is to 
maximize rather than minimize a sum. 


Example 2— Distributing Medical Teams to Countries 


The WORLD HEALTH COUNCIL is devoted to improving health care in the under- 
developed countries of the world. It now has five medical teams available to allocate 
among three such countries to improve their medical care, health education, and 
training programs. Therefore, the council needs to determine how many teams (if 
any) to allocate to each of these countries to maximize the total effectiveness of the 
five teams. The teams must be kept intact, so the number allocated to each country 
must be integer. 

The measure of performance being used is additional person-years of life. (For 
a particular country, this measure equals the country’s increased life expectancy in 
years times its population.) Table 11.1 gives the estimated additional person-years of 
life (in multiples of 1,000) for each country for each possible allocation of medical 
teams. 

Which allocation maximizes the measure of performance? 


FORMULATION: This problem requires making three interrelated decisions, namely,
how many medical teams to allocate to each of the three countries. Therefore, even
though there is no fixed sequence, these three countries can be considered as the three
stages in a dynamic programming formulation. The decision variables $x_n$ ($n = 1, 2, 3$)
would be the number of teams to allocate to stage (country) $n$.


Table 11.1 Data for the World Health Council Problem

                      Thousands of Additional
                      Person-Years of Life
No. of Medical               Country
Teams                  1        2        3


The identification of the states may not be readily apparent. To determine the
states, we ask questions such as the following. What is it that changes from one stage
to the next? Given that the decisions have been made at the previous stages, how can
the status of the situation at the current stage be described? What information about
the current state of affairs is necessary to determine the optimal policy hereafter? On
these bases, an appropriate choice for the "state of the system" is

$s_n$ = number of medical teams still available for allocation to the
remaining countries ($n, \ldots, 3$).

Thus, at stage 1 (country 1), where all three countries remain under consideration for
allocations, $s_1 = 5$. However, at stage 2 or 3 (country 2 or 3), $s_n$ is just 5 minus the
number of teams allocated at preceding stages. With the dynamic programming procedure
of solving backward stage by stage, when we are solving at stage 2 or 3, we
shall not yet have solved for the allocations at the preceding stages. Therefore, we
shall consider every possible state we could be in at stage 2 or 3, namely, $s_n = 0$,
1, 2, 3, 4, or 5.

Let $p_i(x_i)$ be the measure of performance from allocating $x_i$ medical teams to
country $i$, as given in Table 11.1. Thus the objective is to choose $x_1, x_2, x_3$ so as to

\[ \text{Maximize } \sum_{i=1}^{3} p_i(x_i), \]

subject to

\[ \sum_{i=1}^{3} x_i = 5, \]

and the $x_i$ are nonnegative integers.

Using the notation presented in Sec. 11.2, $f_n(s_n, x_n)$ is then

\[ f_n(s_n, x_n) = p_n(x_n) + \max \sum_{i=n+1}^{3} p_i(x_i), \]

where the maximum is taken over $x_{n+1}, \ldots, x_3$ such that

\[ \sum_{i=n}^{3} x_i = s_n \]

and the $x_i$ are nonnegative integers, for $n = 1, 2, 3$. In addition,

\[ f_n^*(s_n) = \max_{x_n = 0, 1, \ldots, s_n} f_n(s_n, x_n). \]

Therefore,

\[ f_n(s_n, x_n) = p_n(x_n) + f_{n+1}^*(s_n - x_n) \]

(with $f_4^*$ defined to be zero). These basic relationships are summarized in Fig. 11.3.

Consequently, the recursive relationship relating the $f_1^*$, $f_2^*$, and $f_3^*$ functions
for this problem is

\[ f_n^*(s_n) = \max_{x_n = 0, 1, \ldots, s_n} \{ p_n(x_n) + f_{n+1}^*(s_n - x_n) \}, \quad \text{for } n = 1, 2. \]

[Figure 11.3: state $s_n$ at stage $n$ is linked by decision $x_n$ (with contribution $p_n(x_n)$) to state $s_n - x_n$ at stage $n + 1$. Value: $f_n(s_n, x_n) = p_n(x_n) + f_{n+1}^*(s_n - x_n)$.]

Figure 11.3 The basic structure for the World Health Council problem.


For the last stage ($n = 3$),

\[ f_3^*(s_3) = \max_{x_3 = 0, 1, \ldots, s_3} p_3(x_3). \]


The resulting dynamic programming calculations are given next. 


SOLUTION PROCEDURE: Beginning with the last stage ($n = 3$), we note that the
values of $p_3(x_3)$ are given in the last column of Table 11.1, and that these values keep
increasing as we move down the column. Therefore, with $s_3$ medical teams still
available for allocation to country 3, the maximum of $p_3(x_3)$ is automatically achieved
by allocating all $s_3$ teams, so $x_3^* = s_3$ and $f_3^*(s_3) = p_3(s_3)$, as shown in the following
table.

[Table for $n = 3$: columns $s_3$, $f_3^*(s_3)$, $x_3^*$, for $s_3 = 0, 1, \ldots, 5$.]

We now move backward to start from the next-to-last stage ($n = 2$). Here,
finding $x_2^*$ requires calculating and comparing $f_2(s_2, x_2)$ for the alternative values of
$x_2$, namely, $x_2 = 0, 1, \ldots, s_2$. To illustrate, we depict this situation when $s_2 = 2$
graphically:

[Diagram: the stage 2 node for state $s_2 = 2$, linked to the stage 3 nodes for states 0, 1, and 2.]

This diagram corresponds to Fig. 11.3 except that all three possible states at stage
3 are shown. Thus, if $x_2 = 0$, the resulting state at stage 3 will be $s_2 - x_2 = 2 - 0 = 2$,
whereas $x_2 = 1$ leads to state 1 and $x_2 = 2$ leads to state 0. The corresponding
values of $p_2(x_2)$ from the country 2 column of Table 11.1 are shown along the links,
and the values of $f_3^*(s_2 - x_2)$ from the $n = 3$ table are given next to the stage 3
nodes. The required calculations for this case of $s_2 = 2$ are summarized below.

$x_2 = 0$: $f_2(2, 0) = p_2(0) + f_3^*(2) = 0 + 70 = 70$.
$x_2 = 1$: $f_2(2, 1) = p_2(1) + f_3^*(1) = 20 + 50 = 70$.
$x_2 = 2$: $f_2(2, 2) = p_2(2) + f_3^*(0) = 45 + 0 = 45$.

Because the objective is maximization, $x_2^* = 0$ or 1 with $f_2^*(2) = 70$.
Proceeding in a similar way with the other possible values of $s_2$ (try it) yields
the following table.

[Table for $n = 2$: $f_2(s_2, x_2) = p_2(x_2) + f_3^*(s_2 - x_2)$ for each $s_2$ and $x_2$, with columns $f_2^*(s_2)$ and $x_2^*$.]




















We now are ready to move backward to solve the original problem where we
are starting from stage 1 ($n = 1$). In this case, the only state to be considered is the
starting state of $s_1 = 5$, as depicted below.

[Diagram: the stage 1 node for state $s_1 = 5$, linked to the stage 2 nodes for states 0 through 5.]

Since allocating $x_1$ medical teams to country 1 leads to a state of ($5 - x_1$) at stage
2, a choice of $x_1 = 0$ leads to the bottom node on the right, $x_1 = 1$ leads to the next
node up, and so forth up to the top node with $x_1 = 5$. The corresponding $p_1(x_1)$ values
from Table 11.1 are shown next to the links. The numbers next to the nodes are
obtained from the $f_2^*(s_2)$ column of the $n = 2$ table. As with $n = 2$, the calculation
needed for each alternative value of the decision variable involves adding the corresponding
link value and node value, as summarized below.

$x_1 = 0$: $f_1(5, 0) = p_1(0) + f_2^*(5) = 0 + 160 = 160$.
$x_1 = 1$: $f_1(5, 1) = p_1(1) + f_2^*(4) = 45 + 125 = 170$.
   ⋮
$x_1 = 5$: $f_1(5, 5) = p_1(5) + f_2^*(0) = 120 + 0 = 120$.

The similar calculations for $x_1 = 2, 3, 4$ (try it) verify that $x_1^* = 1$ with $f_1^*(5) = 170$,
as shown in the following table.

[Table for $n = 1$: $f_1(s_1, x_1) = p_1(x_1) + f_2^*(s_1 - x_1)$ for $s_1 = 5$ and each $x_1$, with columns $f_1^*(s_1)$ and $x_1^*$.]





Thus the optimal solution has $x_1^* = 1$, which makes $s_2 = 5 - 1 = 4$, so
$x_2^* = 3$, which makes $s_3 = 4 - 3 = 1$, so $x_3^* = 1$. Since $f_1^*(5) = 170$, this (1, 3, 1)
allocation of medical teams to the three countries will yield an estimated total of
170,000 additional person-years of life, which is at least 5,000 more than for any
other allocation.
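
The recursion for this example can be traced in code. The sketch below is illustrative only. The entries of `p` that the text quotes directly ($p_1(1) = 45$, $p_1(5) = 120$, $p_2(0), p_2(1), p_2(2) = 0, 20, 45$, and the country 3 values 0, 50, 70 implied by $f_3^*$) are taken from the calculations above; every other entry is an assumption chosen so that the run reproduces the quoted values $f_2^*(4) = 125$, $f_2^*(5) = 160$, and $f_1^*(5) = 170$, and should be replaced by the actual Table 11.1 data.

```python
# p[country][teams] in thousands of additional person-years of life.
# ASSUMPTION: entries not quoted in the text are placeholders consistent with
# the f-values quoted there; they are not confirmed Table 11.1 data.
p = {
    1: [0, 45, 70, 90, 105, 120],
    2: [0, 20, 45, 75, 110, 150],
    3: [0, 50, 70, 80, 100, 130],
}
TEAMS, COUNTRIES = 5, 3

f_star = {COUNTRIES + 1: {s: 0 for s in range(TEAMS + 1)}}   # f_4^* defined to be zero
x_star = {}
for n in range(COUNTRIES, 0, -1):                            # stages 3, 2, 1
    f_star[n], x_star[n] = {}, {}
    for s in range(TEAMS + 1):                               # teams still available
        values = {x: p[n][x] + f_star[n + 1][s - x] for x in range(s + 1)}
        f_star[n][s] = max(values.values())
        # Keep every maximizing decision so that ties remain visible.
        x_star[n][s] = [x for x, v in values.items() if v == f_star[n][s]]

# Follow the stage tables forward from the known initial state s_1 = 5.
allocation, s = [], TEAMS
for n in range(1, COUNTRIES + 1):
    x = x_star[n][s][0]
    allocation.append(x)
    s -= x
print(f_star[1][TEAMS], allocation)
```

Storing a list of maximizers per state mirrors the tie at $s_2 = 2$, where both $x_2 = 0$ and $x_2 = 1$ give 70.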


A Prevalent Problem Type—The Distribution of Effort Problem 


The preceding example illustrates a particularly common type of dynamic program- 
ming problem called the distribution of effort problem. For this type of problem, there 
is just one kind of resource that is to be allocated to a number of activities. The 
objective is to determine how to distribute the effort (the resource) among the activities 
most effectively. For the World Health Council example, the resource involved is the 
medical teams, and the three activities are the health care work in the three countries. 


ASSUMPTIONS: This interpretation of allocating resources to activities should ring 
a bell for you, because it is the typical interpretation for linear programming problems 
given at the beginning of Chap. 3. However, there also are some key differences 
between the distribution of effort problem and linear programming that help illuminate 
the general distinctions between dynamic programming and other areas of mathemat- 
ical programming. 

One key difference is that the distribution of effort problem involves only one 
resource (one functional constraint), whereas linear programming can deal with 
hundreds or even thousands of resources. (In principle, dynamic programming can
handle slightly more than one resource, as we shall illustrate in Example 5 by solving 
the three-resource Wyndor Glass Co. problem, but it quickly becomes very inefficient 
when the number of resources is increased.) 

On the other hand, the distribution of effort problem is far more general than 
linear programming in other ways. Consider the four assumptions of linear program- 
ming presented in Sec. 3.3—proportionality, additivity, divisibility, and certainty. 
Proportionality is routinely violated by nearly all dynamic programming problems, 
including distribution of effort problems (e.g., Table 11.1 violates proportionality). 
Divisibility also is often violated, as in Example 2, where the decision variables must 
be integer. In fact, dynamic programming calculations become more complex when 
divisibility does hold (as in Examples 4 and 5). Although we shall consider the 
distribution of effort problem only under the assumption of certainty, this is not 
necessary, and many other dynamic programming problems violate this assumption 
as well (as described in Sec. 11.4). 


Of the four assumptions of linear programming, the only one needed by the 
distribution of effort problem (or other dynamic programming problems) is additivity 
(or its analog for functions involving a product of terms). This assumption is needed 
to satisfy the principle of optimality for dynamic programming (characteristic 5 in 
Sec. 11.2). 


FORMULATION: Because they always involve allocating one kind of resource to a
number of activities, distribution of effort problems always have the following dynamic
programming formulation (where the ordering of the activities is arbitrary):

Stage $n$ = activity $n$ ($n = 1, 2, \ldots, N$).
$x_n$ = amount of the resource allocated to activity $n$.
State $s_n$ = amount of the resource still available for allocation to the
remaining activities ($n, \ldots, N$).

Therefore, when starting at stage $n$ in state $s_n$, the choice of $x_n$ always results in the
next state at stage ($n + 1$) being $s_{n+1} = s_n - x_n$, as depicted below:

[Diagram: state $s_n$ at stage $n$ linked by decision $x_n$ to state $s_n - x_n$ at stage $n + 1$.]

Note how the structure of this diagram corresponds to the one shown in Fig.
11.3 for the World Health Council example of a distribution of effort problem. What
will differ from one such example to the next is the rest of what is shown in Fig.
11.3, namely, the relationship between $f_n(s_n, x_n)$ and $f_{n+1}^*(s_n - x_n)$, and then the
resulting recursive relationship between the $f_n^*$ and $f_{n+1}^*$ functions. These relationships
depend on the particular objective function for the overall problem.

The structure of the next example is similar to the one for the World Health 
Council because it too is a distribution of effort problem. However, its recursive 
relationship differs in that its objective is to minimize a product of terms for the 
respective stages. 

At first glance this example may appear not to be a deterministic dynamic 
programming problem because probabilities are involved. However, it does indeed fit 
our definition because the state at the next stage is completely determined by the state 
and policy decision at the current stage. 





Example 3— Distributing Scientists to Research Teams 


A government space project is conducting research on a certain engineering problem 
that must be solved before people can fly safely to Mars. Three research teams are 
currently trying three different approaches for solving this problem. The estimate has 
been made that, under present circumstances, the probability that the respective 
teams—call them 1, 2, and 3—will not succeed is 0.40, 0.60, and 0.80, respectively. 
Thus the current probability that all three teams will fail is (0.40)(0.60)(0.80) = 
0.192. Because the objective is to minimize the probability of failure, two more top 
scientists have been assigned to the project. 

Table 11.2 gives the estimated probability that the respective teams will fail
when 0, 1, or 2 additional scientists are added to that team. Only integer numbers of
scientists are considered because each new scientist will need to devote full attention
to one team. The problem is to determine how to allocate the two additional scientists
to minimize the probability that all three teams will fail.


Table 11.2 Data for the Government Space Project Problem

                      Probability of Failure
Number of New                 Team
Scientists             1        2        3

    0                 0.40     0.60     0.80
    1                 0.20     0.40     0.50
    2                 0.15     0.20     0.30


FORMULATION: Because both Examples 2 and 3 are distribution of effort problems, 
their underlying structure is actually very similar. In this case, scientists replace med- 
ical teams as the kind of resource involved, and research teams replace countries as 
the activities. Therefore, instead of medical teams being allocated to countries, sci- 
entists are being allocated to research teams. The only basic difference between the 
two problems is in their objective functions. 

With so few scientists and teams involved, this problem could be solved very 
easily by a process of exhaustive enumeration. However, the dynamic programming 
solution is presented for illustrative purposes. 
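
For comparison, the exhaustive enumeration mentioned above can be sketched as follows. The failure probabilities are those of Table 11.2; the variable names are illustrative choices. With two scientists and three teams there are only six feasible allocations to check.

```python
from itertools import product

# p[team][k] = probability that the team fails with k additional scientists (Table 11.2).
p = {1: [0.40, 0.20, 0.15], 2: [0.60, 0.40, 0.20], 3: [0.80, 0.50, 0.30]}
SCIENTISTS = 2

results = []
for x in product(range(SCIENTISTS + 1), repeat=3):      # every (x1, x2, x3) with 0 <= xi <= 2
    if sum(x) == SCIENTISTS:                            # keep only feasible allocations
        prob_all_fail = p[1][x[0]] * p[2][x[1]] * p[3][x[2]]
        results.append((prob_all_fail, x))

best_prob, best_x = min(results)
for prob, x in sorted(results):
    print(x, round(prob, 3))
```

Enumeration grows quickly with the number of activities and the amount of resource, which is why the stage-by-stage recursion is the approach that scales.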

In this case, stage $n$ ($n = 1, 2, 3$) corresponds to research team $n$, and the state
$s_n$ is the number of new scientists still available for allocation to the remaining teams.
The decision variables $x_n$ ($n = 1, 2, 3$) are the number of additional scientists allocated
to team $n$.

Let $p_i(x_i)$ denote the probability of failure for team $i$ if it is assigned $x_i$ additional
scientists, as given by Table 11.2. Letting $\prod$ denote multiplication, the government's
objective is to choose $x_1, x_2, x_3$ so as to

\[ \text{Minimize } \prod_{i=1}^{3} p_i(x_i) = p_1(x_1)\, p_2(x_2)\, p_3(x_3), \]

subject to

\[ \sum_{i=1}^{3} x_i = 2 \]

and the $x_i$ are nonnegative integers.

Consequently, $f_n(s_n, x_n)$ for this problem is

\[ f_n(s_n, x_n) = p_n(x_n) \cdot \min \prod_{i=n+1}^{3} p_i(x_i), \]

where the minimum is taken over $x_{n+1}, \ldots, x_3$ such that

\[ \sum_{i=n}^{3} x_i = s_n \]

and the $x_i$ are nonnegative integers, for $n = 1, 2, 3$. Thus

\[ f_n^*(s_n) = \min_{x_n \le s_n} f_n(s_n, x_n). \]

Hence

\[ f_n(s_n, x_n) = p_n(x_n) \cdot f_{n+1}^*(s_n - x_n) \]

(with $f_4^*$ defined to be one). Figure 11.4 summarizes these basic relationships.

[Figure 11.4: state $s_n$ at stage $n$ is linked by decision $x_n$ (with contribution $p_n(x_n)$) to state $s_n - x_n$ at stage $n + 1$. Value: $f_n(s_n, x_n) = p_n(x_n) \cdot f_{n+1}^*(s_n - x_n)$.]

Figure 11.4 The basic structure for the government space project problem.

Thus the recursive relationship relating the $f_1^*$, $f_2^*$, and $f_3^*$ functions in this case
is

\[ f_n^*(s_n) = \min_{x_n \le s_n} \{ p_n(x_n) \cdot f_{n+1}^*(s_n - x_n) \}, \quad \text{for } n = 1, 2, \]

and, when $n = 3$,

\[ f_3^*(s_3) = \min_{x_3 \le s_3} p_3(x_3). \]

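SOLUTION PROCEDURE: The resulting dynamic programming calculations are
as follows, with each entry obtained from Table 11.2 and the recursive relationship.

n = 3:

    $s_3$    $f_3^*(s_3)$    $x_3^*$
    0        0.80            0
    1        0.50            1
    2        0.30            2

n = 2:  $f_2(s_2, x_2) = p_2(x_2) \cdot f_3^*(s_2 - x_2)$

    $s_2$    $x_2 = 0$    $x_2 = 1$    $x_2 = 2$    $f_2^*(s_2)$    $x_2^*$
    0        0.48                                    0.48            0
    1        0.30         0.32                       0.30            0
    2        0.18         0.20         0.16          0.16            2

n = 1:  $f_1(s_1, x_1) = p_1(x_1) \cdot f_2^*(s_1 - x_1)$

    $s_1$    $x_1 = 0$    $x_1 = 1$    $x_1 = 2$    $f_1^*(s_1)$    $x_1^*$
    2        0.064        0.060        0.072         0.060           1

Therefore, the optimal solution must have $x_1^* = 1$, which makes $s_2 = 2 - 1 = 1$,
so that $x_2^* = 0$, which makes $s_3 = 1 - 0 = 1$, so that $x_3^* = 1$. Thus teams 1 and
3 should each receive one additional scientist. The new probability that all three teams
will fail would then be 0.060.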
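
The same calculations can be sketched in code to show how the product form changes the recursion: the immediate term multiplies the value of the later stages, and the boundary value $f_4^*$ is one rather than zero. The data are from Table 11.2; the variable names are illustrative assumptions.

```python
# p[team][k] = probability that the team fails with k additional scientists (Table 11.2).
p = {1: [0.40, 0.20, 0.15], 2: [0.60, 0.40, 0.20], 3: [0.80, 0.50, 0.30]}
SCIENTISTS, TEAMS = 2, 3

f_star = {TEAMS + 1: {s: 1.0 for s in range(SCIENTISTS + 1)}}   # f_4^* defined to be one
x_star = {}
for n in range(TEAMS, 0, -1):                                   # stages 3, 2, 1
    f_star[n], x_star[n] = {}, {}
    for s in range(SCIENTISTS + 1):                             # scientists still available
        # Product recursion: p_n(x_n) * f_{n+1}^*(s_n - x_n), minimized over x_n <= s_n.
        f_star[n][s], x_star[n][s] = min(
            (p[n][x] * f_star[n + 1][s - x], x) for x in range(s + 1)
        )

# Trace the optimal allocation forward from s_1 = 2.
allocation, s = [], SCIENTISTS
for n in range(1, TEAMS + 1):
    allocation.append(x_star[n][s])
    s -= x_star[n][s]
print(round(f_star[1][SCIENTISTS], 3), allocation)
```

Only the combining operation, the boundary value, and the direction of optimization differ from the World Health Council recursion; the state transition $s_{n+1} = s_n - x_n$ is unchanged.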

All the examples thus far have had a discrete state variable $s_n$ at each stage.
Furthermore, they all have been reversible in the sense that the solution procedure 
actually could have moved either backward or forward stage by stage. (The latter 
alternative amounts to renumbering the stages in reverse order and then applying the 
procedure in the standard way.) This reversibility is a general characteristic of distri- 
bution of effort problems such as Examples 2 and 3, since the activities (stages) can 
be ordered in any desired manner. 

The next example is different in both respects. Rather than being restricted to
integer values, its state variable $s_n$ at stage $n$ is a continuous variable that can take on
any value over certain intervals. Since $s_n$ now has an infinite number of values, it is
no longer possible to consider each of its feasible values individually. Rather, the
solution for $f_n^*(s_n)$ and $x_n^*$ must be expressed as functions of $s_n$. Furthermore, this
example is not reversible because its stages correspond to time periods, so the solution
procedure must proceed backward.


Example 4—Scheduling Employment Levels 


The workload for the LOCAL JOB SHOP is subject to considerable seasonal fluctua- 
tion. However, machine operators are difficult to hire and costly to train, so the 
manager is reluctant to lay off workers during the slack seasons. He is likewise
reluctant to maintain his peak season payroll when it is not required. Furthermore, he 
is definitely opposed to overtime work on a regular basis. Since all work is done to 
custom orders, it is not possible to build up inventories during slack seasons. There- 
fore, the manager is in a dilemma as to what his policy should be regarding employ- 
ment levels. 

The following estimates are given for the minimum employment requirements 
during the four seasons of the year for the foreseeable future: 





Season          Spring    Summer    Autumn    Winter    Spring

Requirements    255       220       240       200       255


Employment will not be permitted to fall below these levels. Any employment above 
these levels is wasted at an approximate cost of $2,000/person/season. It is estimated 
that the hiring and firing costs are such that the total cost of changing the level of 
employment from one season to the next is $200 times the square of the difference 
in employment levels. Fractional levels of employment are possible because of a few 
part-time employees, and the cost data also apply on a fractional basis. 


FORMULATION: On the basis of the data available, it is not worthwhile to have the 
employment level go above the peak season requirements of 255. Therefore, spring 
employment should be at 255, and the problem is reduced to finding the employment 
level for the other three seasons. 

For a dynamic programming formulation, the seasons should be the stages.
There are actually an indefinite number of stages because the problem extends into
the indefinite future. However, each year begins an identical cycle, and because spring
employment is known, it is possible to consider only one cycle of four seasons ending
with the spring season, as summarized below.


Stage 1 = summer,

Stage 2 = autumn,

Stage 3 = winter,

Stage 4 = spring.

$x_n$ = employment level for stage $n$ ($n = 1, 2, 3, 4$).
($x_4 = 255$.)


It is necessary that the spring season be the last stage because the optimal value 
of the decision variable for each state at the last stage must be either known or 
obtainable without considering other stages. For every other season, the solution for 
the optimal employment level must consider the effect on costs in the following 
season. 

Let

$r_n$ = minimum employment requirement for stage $n$,

where these requirements were given earlier as $r_1 = 220$, $r_2 = 240$, $r_3 = 200$, and
$r_4 = 255$. Thus the only feasible values for $x_n$ are

\[ r_n \le x_n \le 255. \]

Referring to the cost data given in the problem statement,

\[ \text{Cost for stage } n = 200(x_n - x_{n-1})^2 + 2{,}000(x_n - r_n). \]

Note that the cost at the current stage depends only upon the current decision
$x_n$ and the employment in the preceding season $x_{n-1}$. Thus the preceding employment
level is all the information about the current state of affairs that we need to determine
the optimal policy henceforth. Therefore, the state $s_n$ for stage $n$ is

State $s_n = x_{n-1}$.

When $n = 1$, $s_1 = x_0 = x_4 = 255$.

For your ease of reference while working through the problem, a summary of
the above data is given in Table 11.3.

The objective for the problem is to choose $x_1, x_2, x_3$ (with $x_0 = x_4 = 255$) so as to

\[ \text{Minimize } \sum_{i=1}^{4} \left[ 200(x_i - x_{i-1})^2 + 2{,}000(x_i - r_i) \right], \]

subject to $r_i \le x_i \le 255$, for $i = 1, 2, 3, 4$.


Table 11.3 Data for the Local Job Shop Problem

    $n$   $r_n$   Feasible $x_n$           Possible $s_n = x_{n-1}$   Cost

    1     220     $220 \le x_1 \le 255$    $s_1 = 255$                $200(x_1 - 255)^2 + 2{,}000(x_1 - 220)$
    2     240     $240 \le x_2 \le 255$    $220 \le s_2 \le 255$      $200(x_2 - x_1)^2 + 2{,}000(x_2 - 240)$
    3     200     $200 \le x_3 \le 255$    $240 \le s_3 \le 255$      $200(x_3 - x_2)^2 + 2{,}000(x_3 - 200)$
    4     255     $x_4 = 255$              $200 \le s_4 \le 255$      $200(255 - x_3)^2$



Thus for stage $n$ onward ($n = 1, 2, 3, 4$),

\[ f_n(s_n, x_n) = 200(x_n - s_n)^2 + 2{,}000(x_n - r_n) + \min_{r_i \le x_i \le 255} \sum_{i=n+1}^{4} \left[ 200(x_i - x_{i-1})^2 + 2{,}000(x_i - r_i) \right], \]

because $s_n = x_{n-1}$. Also,

\[ f_n^*(s_n) = \min_{r_n \le x_n \le 255} f_n(s_n, x_n). \]

Hence

\[ f_n(s_n, x_n) = 200(x_n - s_n)^2 + 2{,}000(x_n - r_n) + f_{n+1}^*(x_n) \]

(with $f_5^*$ defined to be zero because costs after stage 4 are irrelevant to the analysis).
A summary of these basic relationships is given in Fig. 11.5.

Consequently, the recursive relationship relating the $f_n^*$ functions is

\[ f_n^*(s_n) = \min_{r_n \le x_n \le 255} \left\{ 200(x_n - s_n)^2 + 2{,}000(x_n - r_n) + f_{n+1}^*(x_n) \right\}. \]

The dynamic programming approach uses this relationship to identify successively
these functions, $f_4^*(s_4)$, $f_3^*(s_3)$, $f_2^*(s_2)$, $f_1^*(255)$, and the corresponding
minimizing $x_n$.


SOLUTION PROCEDURE


Stage 4: Beginning at the last stage ($n = 4$), we already know that $x_4^* = 255$, so
the necessary results are

n = 4:

    $s_4$                    $f_4^*(s_4)$           $x_4^*$
    $200 \le s_4 \le 255$    $200(255 - s_4)^2$     255


Stage 3: For the problem consisting of just the last two stages ($n = 3$), the recursive
relationship reduces to

\[ f_3^*(s_3) = \min_{200 \le x_3 \le 255} \left\{ 200(x_3 - s_3)^2 + 2{,}000(x_3 - 200) + f_4^*(x_3) \right\} \]
\[ \phantom{f_3^*(s_3)} = \min_{200 \le x_3 \le 255} \left\{ 200(x_3 - s_3)^2 + 2{,}000(x_3 - 200) + 200(255 - x_3)^2 \right\}, \]

where the possible values of $s_3$ are $240 \le s_3 \le 255$.


[Figure 11.5: state $s_n$ at stage $n$ is linked by decision $x_n$ to state $x_n$ at stage $n + 1$. Value: $f_n(s_n, x_n)$ = sum of $200(x_n - s_n)^2 + 2{,}000(x_n - r_n)$ and $f_{n+1}^*(x_n)$.]

Figure 11.5 The basic structure for the Local Job Shop problem.





One way to solve for the value of $x_3$ that minimizes $f_3(s_3, x_3)$ for any particular
value of $s_3$ is the graphical approach illustrated in Fig. 11.6.

However, a faster way is to use calculus. We want to solve for the minimizing
$x_3$ in terms of $s_3$ by considering $s_3$ to have some fixed (but unknown) value. Therefore,
equate to zero the first (partial) derivative of $f_3(s_3, x_3)$ with respect to $x_3$,

\[ \frac{\partial}{\partial x_3} f_3(s_3, x_3) = 400(x_3 - s_3) + 2{,}000 - 400(255 - x_3) = 400(2x_3 - s_3 - 250) = 0, \]

which yields

\[ x_3^* = \frac{s_3 + 250}{2}. \]

Because the second derivative is positive, and because this solution lies in the feasible
interval for $x_3$ ($200 \le x_3 \le 255$) for all possible $s_3$ ($240 \le s_3 \le 255$), it is indeed
the desired minimum.

Note a key difference between the nature of this solution and those obtained for
the preceding examples where there were only a few possible states to consider. We
now have an infinite number of possible states ($240 \le s_3 \le 255$), so it is no longer
feasible to solve separately for $x_3^*$ for each possible value of $s_3$. Therefore, we instead
have solved for $x_3^*$ as a function of the unknown $s_3$.














[Figure 11.6: plot against $x_3$ (from 200 to 255) of the three terms $200(x_3 - s_3)^2$, $2{,}000(x_3 - 200)$, and $200(255 - x_3)^2$ together with their sum $f_3(s_3, x_3)$; the horizontal axis marks 200, $s_3$, $(s_3 + 250)/2$, and 255.]

Figure 11.6 Graphical solution for $f_3^*(s_3)$ for the Local Job Shop problem.

Using

\[ f_3^*(s_3) = f_3(s_3, x_3^*) = 200\left(\frac{s_3 + 250}{2} - s_3\right)^2 + 200\left(255 - \frac{s_3 + 250}{2}\right)^2 + 2{,}000\left(\frac{s_3 + 250}{2} - 200\right) \]

and reducing this expression algebraically completes the required results for the
two-stage problem summarized as follows.

n = 3:

    $s_3$                    $f_3^*(s_3)$                                                 $x_3^*$
    $240 \le s_3 \le 255$    $50(250 - s_3)^2 + 50(260 - s_3)^2 + 1{,}000(s_3 - 150)$    $(s_3 + 250)/2$
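
The closed-form stage 3 results, $x_3^* = (s_3 + 250)/2$ and $f_3^*(s_3) = 50(250 - s_3)^2 + 50(260 - s_3)^2 + 1{,}000(s_3 - 150)$, can be checked numerically. The sketch below is a rough check under assumptions: the grid step and the sample values of $s_3$ are arbitrary choices, picked so that the exact minimizer falls on the grid.

```python
def f3(s3, x3):
    """Cost of stages 3 and 4 when the winter level is x3 and the autumn level was s3."""
    return 200 * (x3 - s3) ** 2 + 2000 * (x3 - 200) + 200 * (255 - x3) ** 2


def f3_star_closed(s3):
    """Closed-form minimum cost for stage 3 onward, as stated in the n = 3 table."""
    return 50 * (250 - s3) ** 2 + 50 * (260 - s3) ** 2 + 1000 * (s3 - 150)


STEP = 0.25                                                    # assumed grid spacing
x_grid = [200 + STEP * k for k in range(int(55 / STEP) + 1)]   # 200, 200.25, ..., 255

checks = []
for s3 in (240, 243.5, 247, 250, 255):                         # sample states in [240, 255]
    x_best = min(x_grid, key=lambda x3: f3(s3, x3))            # brute-force minimizer
    x_formula = (s3 + 250) / 2                                 # calculus result
    checks.append((s3, x_best, x_formula, f3(s3, x_best), f3_star_closed(s3)))

for row in checks:
    print(row)
```

A grid search of this kind only samples the state interval, whereas the calculus solution covers every $s_3$ at once, which is the point of expressing $x_3^*$ as a function of $s_3$.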


Stage 2: The three-stage (n = 2) and four-stage (n = 1) problems are solved in a
similar fashion. Thus for n = 2,

\[
\begin{aligned}
f_2(s_2, x_2) &= 200(x_2 - s_2)^2 + 2,000(x_2 - r_2) + f_3^*(x_2) \\
  &= 200(x_2 - s_2)^2 + 2,000(x_2 - 240) + 50(250 - x_2)^2 + 50(260 - x_2)^2 + 1,000(x_2 - 150).
\end{aligned}
\]

The possible values of \( s_2 \) are \( 220 \le s_2 \le 255 \), and the feasible region for \( x_2 \) is
\( 240 \le x_2 \le 255 \). The problem is to find the minimizing value of \( x_2 \) in this region,
so that

\[
f_2^*(s_2) = \min_{240 \le x_2 \le 255} f_2(s_2, x_2).
\]


Setting to zero the partial derivative with respect to \( x_2 \),

\[
\begin{aligned}
\frac{\partial}{\partial x_2} f_2(s_2, x_2)
  &= 400(x_2 - s_2) + 2,000 - 100(250 - x_2) - 100(260 - x_2) + 1,000 \\
  &= 200(3x_2 - 2s_2 - 240) \\
  &= 0,
\end{aligned}
\]

yields

\[
x_2 = \frac{2s_2 + 240}{3}.
\]

Because

\[
\frac{\partial^2}{\partial x_2^2} f_2(s_2, x_2) = 600 > 0,
\]

this value of \( x_2 \) is the desired minimizing value if it is feasible (\( 240 \le x_2 \le 255 \)).
Over the possible \( s_2 \) (\( 220 \le s_2 \le 255 \)), this solution actually is feasible only if
\( 240 \le s_2 \le 255 \).

Therefore, we still need to solve for the feasible value of \( x_2 \) that minimizes
\( f_2(s_2, x_2) \) when \( 220 \le s_2 < 240 \). The key to analyzing the behavior of
\( f_2(s_2, x_2) \) over the feasible region for \( x_2 \) again is the partial derivative of
\( f_2(s_2, x_2) \). When \( s_2 < 240 \),

\[
\frac{\partial}{\partial x_2} f_2(s_2, x_2) > 0, \quad \text{for } 240 \le x_2 \le 255,
\]

so that \( x_2 = 240 \) is the desired minimizing value.

The next step is to plug these values of \( x_2 \) into \( f_2(s_2, x_2) \) to obtain \( f_2^*(s_2) \)
for \( s_2 \ge 240 \) and \( s_2 < 240 \). After considerable algebraic manipulation, the following
results are obtained.

n = 2:

    \( s_2 \)                    \( f_2^*(s_2) \)                                                                \( x_2^* \)
    \( 220 \le s_2 \le 240 \)    \( 200(240 - s_2)^2 + 115,000 \)                                             240
    \( 240 \le s_2 \le 255 \)    \( \frac{200}{9}[2(250 - s_2)^2 + (265 - s_2)^2 + 30(3s_2 - 575)] \)    \( (2s_2 + 240)/3 \)


Stage 1: For the four-stage problem (n = 1),

\[
f_1(s_1, x_1) = 200(x_1 - s_1)^2 + 2,000(x_1 - r_1) + f_2^*(x_1).
\]

Because \( r_1 = 220 \), the feasible region for \( x_1 \) is \( 220 \le x_1 \le 255 \). The expression
for \( f_2^*(x_1) \) will differ in the two portions, \( 220 \le x_1 \le 240 \) and
\( 240 \le x_1 \le 255 \), of this region. Therefore,

\[
f_1(s_1, x_1) =
\begin{cases}
200(x_1 - s_1)^2 + 2,000(x_1 - 220) + 200(240 - x_1)^2 + 115,000,
  & \text{if } 220 \le x_1 \le 240, \\[4pt]
200(x_1 - s_1)^2 + 2,000(x_1 - 220)
  + \frac{200}{9}\left[2(250 - x_1)^2 + (265 - x_1)^2 + 30(3x_1 - 575)\right],
  & \text{if } 240 \le x_1 \le 255.
\end{cases}
\]


Considering first the case where \( 220 \le x_1 \le 240 \),

\[
\frac{\partial}{\partial x_1} f_1(s_1, x_1) = 400(x_1 - s_1) + 2,000 - 400(240 - x_1)
  = 400(2x_1 - s_1 - 235).
\]

It is known that \( s_1 = 255 \) (spring employment), so that

\[
\frac{\partial}{\partial x_1} f_1(s_1, x_1) = 800(x_1 - 245) < 0
\]

for all \( x_1 \le 240 \). Therefore, \( x_1 = 240 \) is the minimizing value of \( f_1(s_1, x_1) \)
over the region \( 220 \le x_1 \le 240 \).

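When \( 240 \le x_1 \le 255 \),

\[
\begin{aligned}
\frac{\partial}{\partial x_1} f_1(s_1, x_1)
  &= 400(x_1 - s_1) + 2,000 - \frac{200}{9}\left[4(250 - x_1) + 2(265 - x_1) - 90\right] \\
  &= \frac{400}{3}(4x_1 - 3s_1 - 225).
\end{aligned}
\]

Because

\[
\frac{\partial^2}{\partial x_1^2} f_1(s_1, x_1) > 0, \quad \text{for all } x_1,
\]

set

\[
\frac{\partial}{\partial x_1} f_1(s_1, x_1) = 0,
\]

which yields

\[
x_1 = \frac{3s_1 + 225}{4}.
\]

Because \( s_1 = 255 \), it follows that \( x_1 = 247.5 \) minimizes \( f_1(s_1, x_1) \) over the
region \( 240 \le x_1 \le 255 \).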

Note that this region, \( 240 \le x_1 \le 255 \), includes \( x_1 = 240 \), so that
\( f_1(s_1, 240) > f_1(s_1, 247.5) \). In the next-to-last paragraph, we found that \( x_1 = 240 \)
minimizes \( f_1(s_1, x_1) \) over the region \( 220 \le x_1 \le 240 \). Consequently, we now can
conclude that \( x_1 = 247.5 \) also minimizes \( f_1(s_1, x_1) \) over the entire feasible region,
\( 220 \le x_1 \le 255 \).


Our final calculation is to find \( f_1^*(s_1) \) for \( s_1 = 255 \) by plugging \( x_1 = 247.5 \)
into the expression for \( f_1(255, x_1) \) that holds for \( 240 \le x_1 \le 255 \). Hence

\[
\begin{aligned}
f_1^*(255) &= 200(247.5 - 255)^2 + 2,000(247.5 - 220) \\
  &\quad + \frac{200}{9}\left[2(250 - 247.5)^2 + (265 - 247.5)^2 + 30(742.5 - 575)\right] \\
  &= 185,000.
\end{aligned}
\]


These results are summarized as follows:

n = 1:

    \( s_1 \)    \( f_1^*(s_1) \)    \( x_1^* \)
    255          185,000             247.5


Therefore, tracing back through the tables for n = 2, n = 3, and n = 4,
respectively, and setting \( s_n = x_{n-1}^* \) each time, the resulting optimal solution is
\( x_1^* = 247.5 \), \( x_2^* = 245 \), \( x_3^* = 247.5 \), \( x_4^* = 255 \), with a total
estimated cost per cycle of $185,000.
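
The backward recursion of Example 4 can also be checked numerically. The sketch below is not the calculus-based method used in the text: it replaces the continuous employment levels with a grid (the step size of 0.5 is an assumption chosen so that the optimal levels fall on the grid), and the function name `solve_job_shop` and its parameters are hypothetical.

```python
def solve_job_shop(required=(220, 240, 200, 255), s1=255, step=0.5):
    """Backward recursion for Example 4 on a grid of employment levels.

    Levels are stored as integer multiples of `step` so they can key a dict.
    """
    top = max(required)                      # never employ above the peak requirement

    def to_idx(level):
        return round(level / step)

    def levels(low):                         # grid indices for low <= level <= top
        return range(to_idx(low), to_idx(top) + 1)

    f_next = {x: 0.0 for x in levels(min(required))}   # no cost after the last stage
    policies = []
    for n in range(len(required), 0, -1):    # stages 4, 3, 2, 1
        r_n = required[n - 1]
        # The state s_n is the previous employment level x_{n-1} (s_1 is given).
        states = [to_idx(s1)] if n == 1 else levels(required[n - 2])
        f_now, best = {}, {}
        for s in states:
            cost = {x: 200 * ((x - s) * step) ** 2     # cost of changing employment
                       + 2000 * (x * step - r_n)       # cost of excess employment
                       + f_next[x]                     # optimal cost of later stages
                    for x in levels(r_n)}
            best[s] = min(cost, key=cost.get)
            f_now[s] = cost[best[s]]
        f_next = f_now
        policies.insert(0, best)

    s, plan = to_idx(s1), []                 # trace the optimal decisions forward
    for best in policies:
        s = best[s]
        plan.append(s * step)
    return f_next[to_idx(s1)], plan


total_cost, plan = solve_job_shop()
print(total_cost, plan)   # 185000.0 [247.5, 245.0, 247.5, 255.0]
```

A grid search of this kind only approximates a continuous problem in general; here it reproduces the analytical answer because every optimal level is a multiple of 0.5.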


To conclude our illustrations of deterministic dynamic programming, we give 
one example that requires more than one variable to describe the state at each stage. 


Example 5—Wyndor Glass Company Problem 
Consider the following linear programming problem:

    Maximize      \( Z = 3x_1 + 5x_2 \),
    subject to    \( x_1 \le 4 \)
                  \( 2x_2 \le 12 \)
                  \( 3x_1 + 2x_2 \le 18 \)
    and           \( x_1 \ge 0, \quad x_2 \ge 0 \).


(You might recognize this model as the prototype example for linear programming in 
Chap. 3.) One way of solving small linear (or nonlinear) programming problems like 
this one is by dynamic programming, which is illustrated below. 


FORMULATION: This problem requires making two interrelated decisions, namely,
the level of activity 1, \( x_1 \), and the level of activity 2, \( x_2 \). Therefore, these two
activities can be interpreted as the two stages in a dynamic programming formulation. Although
they can be taken in either order, let stage n = activity n (n = 1, 2). Thus \( x_n \) is the
decision variable at stage n.

What are the states? In other words, given that the decision had been made at 
prior stages (if any), what information is needed about the current state of affairs 
before the decision can be made at stage n? Reflection might suggest that the required 
information is the amount of slack left in the functional constraints. Interpret the right- 
hand side of these constraints (4, 12, and 18) as the total available amount of resources 
1, 2, and 3, respectively (as described in Sec. 3.1). Then the state \( s_n \) can be defined
as

    State \( s_n \) = amount of the respective resources still available for
                      allocation to the remaining activities (n, . . . , 2).

(Note that the definition of the state is analogous to that for distribution of effort
problems, including Examples 2 and 3, except that there are now three resources to
be allocated instead of just one.) Thus

\[
s_n = (R_1, R_2, R_3),
\]

where \( R_i \) is the amount of resource i remaining to be allocated (i = 1, 2, 3). Therefore,

\[
s_1 = (4, 12, 18), \qquad s_2 = (4 - x_1, 12, 18 - 3x_1).
\]

However, when we begin by solving for stage 2, we shall not yet know the value of
\( x_1 \), and so use \( s_2 = (R_1, R_2, R_3) \) at that point.

Therefore, in contrast to the preceding examples, this problem has three state 
variables (i.e., a state vector with three components) at each stage rather than one. 
From a theoretical standpoint, this difference is not particularly serious. It only means 
that, instead of considering all possible values of the one state variable, we must 
consider all possible combinations of values of the several state variables. However, 
from the standpoint of computational efficiency, this difference tends to be a very 
serious complication. Because the number of combinations, in general, can be as large 
as the product of the number of possible values of the respective variables, the number 
of required calculations tends to "blow up" rapidly when additional state variables
are introduced. This phenomenon has been given the apt name of the curse of 
dimensionality. 

Each of the three state variables is continuous. Therefore, rather than consider 
each possible combination of values separately, we must use the approach introduced 
in Example 4 of solving for the required information as a function of the state of the 
system. 

Despite these complications, this problem is small enough that it can still be 
solved without great difficulty. To solve it, we need to introduce the usual dynamic 
programming notation. Thus, 


\( f_2(R_1, R_2, R_3, x_2) \) = contribution of activity 2 to Z if the system starts in state
                            \( (R_1, R_2, R_3) \) at stage 2 and the decision is \( x_2 \)
                          = \( 5x_2 \),

\( f_1(4, 12, 18, x_1) \) = contribution of activities 1 and 2 to Z if the system starts
                        in state (4, 12, 18) at stage 1, the immediate decision is
                        \( x_1 \), and then an optimal decision is made at stage 2
                      = \( 3x_1 + \max \{5x_2\} \), with the maximum taken over
                        \( 2x_2 \le 12 \), \( 2x_2 \le 18 - 3x_1 \), \( x_2 \ge 0 \).

Similarly, for n = 1, 2,

\[
f_n^*(R_1, R_2, R_3) = \max_{x_n} f_n(R_1, R_2, R_3, x_n),
\]

where this maximum is taken over the feasible values of \( x_n \). Consequently, using the
relevant portions of the constraints of the problem,

\[
\begin{aligned}
&(1) \quad f_2^*(R_1, R_2, R_3) = \max_{\substack{2x_2 \le R_2 \\ 2x_2 \le R_3 \\ x_2 \ge 0}} \{5x_2\}, \\
&(2) \quad f_1(4, 12, 18, x_1) = 3x_1 + f_2^*(4 - x_1, 12, 18 - 3x_1), \\
&(3) \quad f_1^*(4, 12, 18) = \max_{\substack{x_1 \le 4 \\ 3x_1 \le 18 \\ x_1 \ge 0}} \{3x_1 + f_2^*(4 - x_1, 12, 18 - 3x_1)\}.
\end{aligned}
\]

Equation (1) will be used to solve the stage 2 problem. Equation (2) shows the
basic dynamic programming structure for this problem, as also depicted in Fig. 11.7.
Equation (3) gives the recursive relationship between \( f_1^* \) and \( f_2^* \) that will be
used to solve the stage 1 problem.


SOLUTION PROCEDURE 

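Stage 2: To solve at the last stage (n = 2), Eq. (1) indicates that \( x_2^* \) must be the
largest value of \( x_2 \) that simultaneously satisfies \( 2x_2 \le R_2 \), \( 2x_2 \le R_3 \), and
\( x_2 \ge 0 \). Assuming that \( R_2 \ge 0 \) and \( R_3 \ge 0 \), so that feasible solutions exist,
this largest value is the smaller of \( R_2/2 \) and \( R_3/2 \). Thus the solution is

n = 2:

    \( (R_1, R_2, R_3) \)              \( f_2^*(R_1, R_2, R_3) \)          \( x_2^* \)
    \( R_2 \ge 0, \; R_3 \ge 0 \)      \( 5 \min\{R_2/2, R_3/2\} \)        \( \min\{R_2/2, R_3/2\} \)

[Figure 11.7 The basic structure for the Wyndor Glass Co. linear programming problem.
State (4, 12, 18) at stage 1 and decision \( x_1 \) lead to state \( (4 - x_1, 12, 18 - 3x_1) \)
at stage 2; the value \( f_1(4, 12, 18, x_1) \) is the sum of \( 3x_1 \) and
\( f_2^*(4 - x_1, 12, 18 - 3x_1) \).]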


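Stage 1: To solve the two-stage problem (n = 1), we plug the solution just obtained
for \( f_2^*(R_1, R_2, R_3) \) into Eq. (3). For stage 2,

\[
(R_1, R_2, R_3) = (4 - x_1, 12, 18 - 3x_1),
\]

so that

\[
f_2^*(4 - x_1, 12, 18 - 3x_1) = 5 \min\left\{\frac{R_2}{2}, \frac{R_3}{2}\right\}
  = 5 \min\left\{\frac{12}{2}, \frac{18 - 3x_1}{2}\right\}
\]

is the specific solution plugged into Eq. (3). After combining its constraints on \( x_1 \),
Eq. (3) then becomes

\[
f_1^*(4, 12, 18) = \max_{0 \le x_1 \le 4}
  \left\{3x_1 + 5 \min\left\{\frac{12}{2}, \frac{18 - 3x_1}{2}\right\}\right\}.
\]

Over the feasible interval, \( 0 \le x_1 \le 4 \), notice that

\[
\min\left\{\frac{12}{2}, \frac{18 - 3x_1}{2}\right\} =
\begin{cases}
6, & \text{if } 0 \le x_1 \le 2, \\
9 - \frac{3}{2}x_1, & \text{if } 2 \le x_1 \le 4,
\end{cases}
\]

so that

\[
3x_1 + 5 \min\left\{\frac{12}{2}, \frac{18 - 3x_1}{2}\right\} =
\begin{cases}
3x_1 + 30, & \text{if } 0 \le x_1 \le 2, \\
45 - \frac{9}{2}x_1, & \text{if } 2 \le x_1 \le 4.
\end{cases}
\]

Because both

\[
\max_{0 \le x_1 \le 2} \{3x_1 + 30\} \quad \text{and} \quad
\max_{2 \le x_1 \le 4} \left\{45 - \frac{9}{2}x_1\right\}
\]

achieve their maximum at \( x_1 = 2 \), it follows that \( x_1^* = 2 \), and that this maximum is
36, as given in the following table.

n = 1:

    \( (R_1, R_2, R_3) \)    \( f_1^*(R_1, R_2, R_3) \)    \( x_1^* \)
    (4, 12, 18)              36                            2
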
Because \( x_1^* = 2 \) leads to

\[
R_1 = 4 - 2 = 2, \quad R_2 = 12, \quad R_3 = 18 - 3(2) = 12
\]

for stage 2, the n = 2 table yields \( x_2^* = 6 \). Consequently, \( x_1^* = 2 \), \( x_2^* = 6 \)
is the optimal solution for this problem (as originally found in Sec. 3.1), and the n = 1
table shows that the resulting value of Z is 36.
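
The two-stage calculation for Example 5 can be sketched in a few lines. This is an illustration rather than the analytical argument of the text: stage 2 uses the closed-form solution of Eq. (1), while stage 1 scans \( x_1 \) over a grid on [0, 4] (the step of 0.25 and the function names `f2_star` and `solve_wyndor` are assumptions made for the sketch).

```python
def f2_star(r2, r3):
    """Stage 2: the best x2 is the largest value allowed by 2*x2 <= R2 and 2*x2 <= R3."""
    x2 = min(r2 / 2, r3 / 2)
    return 5 * x2, x2


def solve_wyndor(step=0.25):
    """Stage 1: scan x1 over a grid on [0, 4] and apply Eq. (3)."""
    best = None
    for i in range(round(4 / step) + 1):
        x1 = i * step
        # The state passed to stage 2 is (4 - x1, 12, 18 - 3*x1); R1 is not needed there.
        value2, x2 = f2_star(12, 18 - 3 * x1)
        z = 3 * x1 + value2
        if best is None or z > best[0]:
            best = (z, x1, x2)
    return best


print(solve_wyndor())   # (36.0, 2.0, 6.0)
```

Because the stage 1 objective is piecewise linear with a single breakpoint at \( x_1 = 2 \), any grid that contains that breakpoint recovers the exact optimum.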


11.4 Probabilistic Dynamic Programming 


Probabilistic dynamic programming differs from deterministic dynamic programming
in that the state at the next stage is not completely determined by the state and policy
decision at the current stage. Rather, there is a probability distribution for what the
next state will be. However, this probability distribution still is completely determined
by the state and policy decision at the current stage. The resulting basic structure for
probabilistic dynamic programming is described diagrammatically in Fig. 11.8.

For purposes of this diagram, we let S denote the number of possible states at
stage n + 1 and label these states on the right side as 1, 2, . . . , S. The system goes
to state i with probability \( p_i \) (i = 1, 2, . . . , S) given the state \( s_n \) and decision
\( x_n \) at stage n. If the system goes to state i, \( C_i \) is the contribution of stage n to the
objective function.

When Fig. 11.8 is expanded to include all the possible states and decisions at 
all the stages, it is sometimes referred to as a decision tree. If the decision tree is 
not too large, it provides a useful way of summarizing the various possibilities that 
may occur. 

Because of the probabilistic structure, the relationship between \( f_n(s_n, x_n) \) and
the \( f_{n+1}^*(s_{n+1}) \) necessarily is somewhat more complicated than for deterministic
dynamic programming. The precise form of this relationship will depend upon the form
of the overall objective function.

To illustrate, suppose that the objective is to minimize the expected sum of the
contributions from the individual stages. In this case, \( f_n(s_n, x_n) \) would represent the
minimum expected sum from stage n onward, given that the state and policy decision
at stage n are \( s_n \) and \( x_n \), respectively. Consequently,

\[
f_n(s_n, x_n) = \sum_{i=1}^{S} p_i \left[C_i + f_{n+1}^*(i)\right],
\]

with

\[
f_{n+1}^*(i) = \min_{x_{n+1}} f_{n+1}(i, x_{n+1}),
\]

where this minimization is taken over the feasible values of \( x_{n+1} \).
Example 6 has this same form. Example 7 will illustrate another form.
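
The expected-sum relationship above translates directly into code. The sketch below is a generic illustration with made-up numbers: the function names, the dictionary layout, and the two decisions "a" and "b" are all hypothetical and are not taken from the text.

```python
def expected_value(outcomes, f_next):
    """f_n(s_n, x_n) = sum_i p_i * (C_i + f*_{n+1}(i)) for one (state, decision) pair.

    outcomes: list of (probability, contribution, next_state) triples.
    f_next:   dict mapping each next state i to f*_{n+1}(i).
    """
    return sum(p * (c + f_next[i]) for p, c, i in outcomes)


def best_decision(decisions, f_next):
    """f*_n(s_n): minimize the expected value over the feasible decisions."""
    values = {x: expected_value(outcomes, f_next) for x, outcomes in decisions.items()}
    x_best = min(values, key=values.get)
    return values[x_best], x_best


# Hypothetical data: two decisions from one state, two possible next states.
f_next = {1: 10.0, 2: 4.0}
decisions = {
    "a": [(0.5, 3.0, 1), (0.5, 3.0, 2)],    # 0.5*(3+10) + 0.5*(3+4)   = 10.0
    "b": [(0.25, 5.0, 1), (0.75, 5.0, 2)],  # 0.25*(5+10) + 0.75*(5+4) = 10.5
}
print(best_decision(decisions, f_next))   # (10.0, 'a')
```

Note that the probabilities attached to each decision depend only on the current state and that decision, which is exactly the structure described for Fig. 11.8.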


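[Figure 11.8 The basic structure for probabilistic dynamic programming. From state
\( s_n \) and decision \( x_n \) at stage n, with value \( f_n(s_n, x_n) \), the system moves to state i
at stage n + 1 with probability \( p_i \) and contribution \( C_i \) from stage n
(i = 1, 2, . . . , S); the value at state i is \( f_{n+1}^*(i) \).]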


Example 6—Determining Reject Allowances 


The HIT-AND-MISS MANUFACTURING COMPANY has received an order to supply
one item of a particular type. However, the customer has specified such stringent
quality requirements that the manufacturer may have to produce more than one item
to obtain an item that is acceptable. The number of extra items produced in a production
run is called the reject allowance. Including a reject allowance is common
practice when producing for a custom order, and it seems advisable in this case.

The manufacturer estimates that each item of this type that is produced will be
acceptable with probability 1/2 and defective (without possibility for rework) with
probability 1/2. Thus the number of acceptable items produced in a lot of size L will have
a binomial distribution; that is, the probability of producing zero acceptable items in
such a lot is \( (1/2)^L \).

Marginal production costs for this product are estimated to be $100 per item 
(even if defective), and excess items are worthless. In addition, a setup cost of $300 
must be incurred whenever the production process is set up for this product, and a 
completely new setup at this same cost is required for each subsequent production run 
if a lengthy inspection procedure reveals that a completed lot has not yielded an 
acceptable item. The manufacturer has time to make no more than three production 
runs. If an acceptable item has not been obtained by the end of the third production 
run, the cost to the manufacturer in lost sales income and in penalty costs would be 
$1,600. 

The objective is to determine the policy regarding the lot size (1 + reject 
allowance) for the required production run(s) that minimizes total expected cost for 
the manufacturer. 


FORMULATION: A dynamic programming formulation for this problem is 


Stage n = production run n (n = 1, 2, 3),
\( x_n \) = lot size for stage n,
State \( s_n \) = number of acceptable items still needed (one or zero)
                  when beginning stage n.

Thus, at stage 1, the state \( s_1 = 1 \). If at least one acceptable item is obtained
subsequently, the state changes to \( s_n = 0 \), after which no additional costs need to be
incurred.

Because of the stated objective for the problem,

\( f_n(s_n, x_n) \) = total expected cost for stages n, . . . , 3 if the system starts
                  in state \( s_n \) at stage n, the immediate decision is \( x_n \), and
                  optimal decisions are made thereafter,

\[
f_n^*(s_n) = \min_{x_n = 0, 1, \ldots} f_n(s_n, x_n),
\]

where \( f_n^*(0) = 0 \). Using $100 as the unit of money, the contribution to cost from
stage n is \( (K + x_n) \) regardless of the next state, where

\[
K =
\begin{cases}
0, & \text{if } x_n = 0, \\
3, & \text{if } x_n > 0.
\end{cases}
\]

Therefore, for \( s_n = 1 \),

\[
\begin{aligned}
f_n(1, x_n) &= K + x_n + \left(\tfrac{1}{2}\right)^{x_n} f_{n+1}^*(1)
               + \left[1 - \left(\tfrac{1}{2}\right)^{x_n}\right] f_{n+1}^*(0) \\
            &= K + x_n + \left(\tfrac{1}{2}\right)^{x_n} f_{n+1}^*(1)
\end{aligned}
\]

[where \( f_4^*(1) \) is defined to be 16, the terminal cost if no acceptable items have been
obtained]. A summary of these basic relationships is given in Fig. 11.9.

Consequently, the recursive relationship for the dynamic programming calculations is

\[
f_n^*(1) = \min_{x_n = 0, 1, \ldots} \left\{K + x_n + \left(\tfrac{1}{2}\right)^{x_n} f_{n+1}^*(1)\right\}
\]

for n = 1, 2, 3.


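SOLUTION PROCEDURE: The calculations using this recursive relationship are
summarized as follows.

n = 3, with \( f_3(1, x_3) = K + x_3 + 16(1/2)^{x_3} \):

    \( s_3 \)    \( x_3 \) = 0    1     2    3    4    5        \( f_3^*(s_3) \)    \( x_3^* \)
    0            0                                              0                   0
    1            16           12    9    8    8    8 1/2     8                   3 or 4

n = 2, with \( f_2(1, x_2) = K + x_2 + (1/2)^{x_2} f_3^*(1) \):

    \( s_2 \)    \( x_2 \) = 0    1    2    3    4        \( f_2^*(s_2) \)    \( x_2^* \)
    0            0                                        0                   0
    1            8            8    7    7    7 1/2     7                   2 or 3

n = 1, with \( f_1(1, x_1) = K + x_1 + (1/2)^{x_1} f_2^*(1) \):

    \( s_1 \)    \( x_1 \) = 0    1        2        3        4         \( f_1^*(s_1) \)    \( x_1^* \)
    1            7            7 1/2    6 3/4    6 7/8    7 7/16    6 3/4               2

[Figure 11.9 The basic structure for the Hit-and-Miss Manufacturing Co. problem. From
state 1 at stage n with decision \( x_n \), the system moves to state 0 with probability
\( 1 - (1/2)^{x_n} \), where \( f_{n+1}^*(0) = 0 \), or remains in state 1 with probability
\( (1/2)^{x_n} \); the contribution from stage n is \( K + x_n \) in either case, so that
\( f_n(1, x_n) = K + x_n + (1/2)^{x_n} f_{n+1}^*(1) \).]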


Thus the optimal policy is to produce two items on the first production run; if
none is acceptable, then produce either two or three items on the second production
run; if none is acceptable, then produce either three or four items on the third
production run. The total expected cost for this policy is $675.
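
The recursion for Example 6 is small enough to evaluate exactly. The following sketch uses exact fractions and the problem data stated above (setup cost 3, terminal cost 16, defect probability 1/2, all in units of $100); the function name `reject_allowance` and the cap `max_lot=6` on the lot sizes examined are assumptions of the sketch, the cap being safe here because each extra item beyond that costs more than it can save.

```python
from fractions import Fraction


def reject_allowance(runs=3, setup=3, penalty=16,
                     p_defective=Fraction(1, 2), max_lot=6):
    """Return {n: (f*_n(1), list of optimal lot sizes)} for n = runs, ..., 1."""
    f_next = Fraction(penalty)            # f*_{runs+1}(1): cost if no item is ever accepted
    results = {}
    for n in range(runs, 0, -1):
        # f_n(1, x) = K + x + (1/2)^x * f*_{n+1}(1), with K = setup only when x > 0
        costs = {x: (setup if x > 0 else 0) + x + p_defective ** x * f_next
                 for x in range(max_lot + 1)}
        f_best = min(costs.values())
        results[n] = (f_best, [x for x, c in costs.items() if c == f_best])
        f_next = f_best
    return results


for n, (cost, lots) in sorted(reject_allowance().items(), reverse=True):
    print(n, cost, lots)
# 3 8 [3, 4]
# 2 7 [2, 3]
# 1 27/4 [2]
```

The value 27/4 for n = 1 corresponds to 6.75 in units of $100, i.e., the $675 quoted in the text.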


Example 7— Winning in Las Vegas 


An enterprising young statistician believes that she has developed a system for winning 
a popular Las Vegas game. Her colleagues do not believe that her system works, so 
they have made a large bet with her that, starting with three chips, she will not have 
five chips after three plays of the game. Each play of the game involves betting any 
desired number of available chips and then either winning or losing this number of 
chips. The statistician believes that her system will give her a probability of 2/3 of
winning a given play of the game.

Assuming the statistician is correct, we now will use dynamic programming to 
determine her optimal policy regarding how many chips to bet (if any) at each of the 
three plays of the game. The decision at each play should take into account the results 
of earlier plays. The objective is to maximize the probability of winning her bet with 
her colleagues. 


FORMULATION: The dynamic programming formulation for this problem is 


Stage n = nth play of the game (n = 1, 2, 3),
\( x_n \) = number of chips to bet at stage n,
State \( s_n \) = number of chips in hand to begin stage n.


This definition of the state is chosen because it provides the needed information about 
the current situation for making an optimal decision on how many chips to bet next. 

Because the objective is to maximize the probability that the statistician will 
win her bet, the objective function to be maximized at each stage must be the prob- 
ability of finishing the three plays with at least five chips. Therefore, 


\( f_n(s_n, x_n) \) = probability of finishing the three plays with at least five chips,
                  given that the statistician starts stage n in state \( s_n \), makes the
                  immediate decision \( x_n \), and makes optimal decisions thereafter,

\[
f_n^*(s_n) = \max_{x_n = 0, 1, \ldots, s_n} f_n(s_n, x_n).
\]

The expression for \( f_n(s_n, x_n) \) must reflect the fact that it may still be possible
to accumulate five chips eventually even if the statistician should lose the next play.
If she should lose, the state at the next stage would be \( (s_n - x_n) \), and the probability
of finishing with at least five chips would then be \( f_{n+1}^*(s_n - x_n) \). If she should win
the next play instead, the state would become \( (s_n + x_n) \), and the corresponding
probability would be \( f_{n+1}^*(s_n + x_n) \). Because the assumed probability of winning a
given play is 2/3, it now follows that

\[
f_n(s_n, x_n) = \tfrac{1}{3} f_{n+1}^*(s_n - x_n) + \tfrac{2}{3} f_{n+1}^*(s_n + x_n)
\]

[where \( f_4^*(s_4) \) is defined to be 0 for \( s_4 < 5 \) and 1 for \( s_4 \ge 5 \)]. Thus there is no
direct contribution to the objective function from stage n in addition to the effect of
then being in the next state. These basic relationships are summarized in Fig. 11.10.

[Figure 11.10 The basic structure for the Las Vegas problem.]

Therefore, the recursive relationship for this problem is

\[
f_n^*(s_n) = \max_{x_n = 0, 1, \ldots, s_n}
  \left\{\tfrac{1}{3} f_{n+1}^*(s_n - x_n) + \tfrac{2}{3} f_{n+1}^*(s_n + x_n)\right\},
\]

for n = 1, 2, 3, with \( f_4^*(s_4) \) as just defined.


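SOLUTION PROCEDURE: This recursive relationship leads to the following
computational results.

n = 3:

    \( s_3 \)      \( f_3^*(s_3) \)    \( x_3^* \)
    0              0                   -
    1              0                   -
    2              0                   -
    3              2/3                 2 (or more)
    4              2/3                 1 (or more)
    \( \ge 5 \)    1                   0 (or \( \le s_3 - 5 \))

n = 2, with \( f_2(s_2, x_2) = \tfrac{1}{3} f_3^*(s_2 - x_2) + \tfrac{2}{3} f_3^*(s_2 + x_2) \):

    \( s_2 \)      \( x_2 \) = 0    1      2      3      4      \( f_2^*(s_2) \)    \( x_2^* \)
    0              0                                            0                   -
    1              0            0                               0                   -
    2              0            4/9    4/9                      4/9                 1 or 2
    3              2/3          4/9    2/3    2/3               2/3                 0, 2, or 3
    4              2/3          8/9    2/3    2/3    2/3        8/9                 1
    \( \ge 5 \)    1                                            1                   0 (or \( \le s_2 - 5 \))

n = 1, with \( f_1(s_1, x_1) = \tfrac{1}{3} f_2^*(s_1 - x_1) + \tfrac{2}{3} f_2^*(s_1 + x_1) \):

    \( s_1 \)    \( x_1 \) = 0    1        2      3      \( f_1^*(s_1) \)    \( x_1^* \)
    3            2/3          20/27    2/3    2/3    20/27               1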





Therefore, the optimal policy is

    \( x_1^* = 1 \):
        if win, \( x_2^* = 1 \):
            if win, \( x_3^* = 0 \);
            if lose, \( x_3^* = 2 \) or 3.
        if lose, \( x_2^* = 1 \) or 2:
            if win, \( x_3^* = 2 \) or 3 (for \( x_2^* = 1 \)), or 1, 2, 3, or 4 (for \( x_2^* = 2 \));
            if lose, bet is lost.

This policy gives the statistician a probability of 20/27 of winning her bet with her
colleagues.
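
The Las Vegas recursion can be evaluated exactly with rational arithmetic. The sketch below assumes the win probability of 2/3 and the data of the example (three chips, three plays, a target of five chips); the function name `las_vegas` and the table layout are hypothetical choices made for illustration.

```python
from fractions import Fraction


def las_vegas(start=3, target=5, plays=3, p_win=Fraction(2, 3)):
    """Return {n: {s_n: (f*_n(s_n), list of optimal bets)}} for n = plays, ..., 1."""
    # f*_{plays+1}(s): 1 if the target has been reached, 0 otherwise.
    f_next = {s: Fraction(int(s >= target)) for s in range(start * 2 ** plays + 1)}
    tables = {}
    for n in range(plays, 0, -1):
        table = {}
        # At most start * 2**(n-1) chips can be in hand at the beginning of stage n.
        for s in range(start * 2 ** (n - 1) + 1):
            values = {x: (1 - p_win) * f_next[s - x] + p_win * f_next[s + x]
                      for x in range(s + 1)}
            f_best = max(values.values())
            table[s] = (f_best, [x for x, v in values.items() if v == f_best])
        tables[n] = table
        f_next = {s: f for s, (f, _) in table.items()}
    return tables


tables = las_vegas()
print(tables[1][3])   # (Fraction(20, 27), [1])
```

Looking up `tables[2][4]`, `tables[2][2]`, and `tables[3][3]` in the same way gives the bets along the branches of the optimal policy described above.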


11.5 Conclusions


Dynamic programming is a very useful technique for making a sequence of interrelated
decisions. It requires formulating an appropriate recursive relationship for each
individual problem. However, it provides a great computational savings over using
exhaustive enumeration to find the best combination of decisions, especially for large
problems. For example, if a problem has 10 stages with 10 states and 10 possible
decisions at each stage, then exhaustive enumeration must consider up to \( 10^{10} \)
combinations, whereas dynamic programming need make no more than \( 10^3 \) calculations
(10 for each state at each stage).
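
The arithmetic behind this comparison is easy to reproduce. The helper below is a hypothetical illustration (the name `count_work` is not from the text) that counts the decision sequences examined by exhaustive enumeration and the calculations made by dynamic programming for the stated problem size.

```python
def count_work(stages=10, states=10, decisions=10):
    """Return (combinations enumerated exhaustively, calculations made by DP)."""
    exhaustive = decisions ** stages           # one decision chosen at every stage
    dynamic = stages * states * decisions      # every decision for every state and stage
    return exhaustive, dynamic


print(count_work())   # (10000000000, 1000)
```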

This chapter has considered only dynamic programming with a finite number of
stages. Chapter 20 is devoted to a general kind of model for probabilistic dynamic
programming where the stages continue to recur indefinitely—namely, Markovian
decision processes.

PROBLEMS


1. Consider the following network, where each number along a link represents the actual
distance between the pair of nodes connected by that link.

[Figure: network for Prob. 1, with \( f_n^*(s_n) \) already given for four of the nodes,
including \( f_2^*(C) = 13 \).]

The objective is to find the shortest route from the origin to the destination.

(a) What are the stages and states for the dynamic programming formulation of this 
problem? 

(b) Use dynamic programming to solve this problem. However, instead of using the
usual tables, show your work graphically. In particular, redraw the preceding
network, where the answers already are given for \( f_n^*(s_n) \) for four of the nodes; then
solve for and fill in \( f_2^*(B) \) and \( f_1^*(O) \). Draw an arrowhead that shows the optimal
link to take out of each of the latter two nodes. Finally, identify the optimal route
by following the arrows from node O onward to node T.

(c) Use dynamic programming to solve this problem by constructing the usual tables
for n = 3, n = 2, and n = 1.

(d) Use the shortest-route algorithm presented in Sec. 10.3 to solve this problem.
Compare and contrast this approach with the one in parts (b) and (c).


2. The sales manager for a publisher of college textbooks has six traveling salespeople 
to assign to three different regions of the country. She has decided that each region should be 
assigned at least one salesperson and that each individual salesperson should be restricted to 
one of the regions, but now she wants to determine how many salespeople should be assigned 
to the respective regions in order to maximize sales. 

The following table gives the estimated increase in sales (in appropriate units) in each 
region if it were allocated various numbers of salespeople: 





Number                  Region
of Salespeople     1      2      3
      1           35     21     28
      2           48     42     41
      3           70     56     63
      4           89     70     75





(a) Use dynamic programming to solve this problem. Instead of using the usual tables,
show your work graphically by constructing and filling in a network such as the one
shown for Prob. 1. Proceed as in part (b) of Prob. 1 by solving for \( f_n^*(s_n) \) for each
node (except the terminal node) and writing its value by the node. Draw an arrowhead
to show the optimal link (or links in the case of a tie) to take out of each node.
Finally, identify the resulting optimal route (or routes) through the network and the
corresponding optimal solution (or solutions).

(b) Use dynamic programming to solve this problem by constructing the usual tables
for n = 3, n = 2, and n = 1.


3.* The owner of a chain of three grocery stores has purchased five crates of fresh
strawberries. The estimated probability distribution of potential sales of the strawberries before
spoilage differs among the three stores. Therefore, the owner wants to know how he should
allocate the five crates to the three stores to maximize expected profit.

For administrative reasons, the owner does not wish to split crates between stores.
However, he is willing to distribute zero crates to any of his stores.

The following table gives the estimated expected profit at each store when it is allocated
various numbers of crates:








Number               Stores
of Crates       1      2      3
    0           0      0      0
    1           5      6      4
    2           9     11      9
    3          14     15     13
    4          17     19     18

Use dynamic programming to determine how many of the five crates should be assigned
to each of the three stores to maximize the total expected profit.


4. A college student has 7 days remaining before final examinations begin in her four 
courses, and she wants to allocate this study time as effectively as possible. She needs at least 
1 day on each course, and she likes to concentrate on just one course each day, so she wants 
to allocate 1, 2, 3, or 4 days to each course. Having recently taken an operations research 
course, she decides to use dynamic programming to make these allocations to maximize the 
total grade points to be obtained from the four courses. She estimates that the alternative 
allocations for each course would yield the number of grade points shown in the following 
table: 


Estimated Grade Points 


Number of Course 
Study Days 








Solve this problem by dynamic programming. 


5. A company is planning its advertising strategy for next year for its three major 
products. Since the three products are quite different, each advertising effort will focus on a 
single product. In units of millions of dollars, a total of 6 is available for advertising next year, 
where the advertising expenditure for each product must be an integer greater than or equal to 
1. The vice-president for marketing has established the objective, namely, determine how much 
to spend on each product in order to maximize total sales. The following table gives the 
estimated increase in sales (in appropriate units) for the different advertising expenditures: 







Product 





Advertising 
Expenditure 2 3 
4 6 
10 8 9 


Ae UN m 


Use dynamic programming to solve this problem.


6. A political campaign is entering its final stage, and polls indicate a very close election.
One of the candidates has enough funds left to purchase TV time for a total of five prime-time
commercials on TV stations located in four different areas. Based on polling information, an
estimate has been made of the number of additional votes that can be won in the different
broadcasting areas depending upon the number of commercials run. These estimates are given
in the following table in units of thousands of votes:








Number of                 Area
Commercials      1      2      3      4
     0           0      0      0      0
     1           4      6      5      3
     2           7      8      9      7
     3           9     10     11     12
     4          12     11     10     14
     5          15     12      9     16


Use dynamic programming to determine how the five commercials should be distributed 
among the four areas in order to maximize the estimated number of votes won. 


7. A county chairwoman of a certain political party is making plans for an upcoming 
presidential election. She has received the services of six volunteer workers for precinct work, 
and she wants to assign them to four precincts in such a way as to maximize their effectiveness. 
She feels that it would be inefficient to assign a worker to more than one precinct, but she is 
willing to assign no workers to any one of the precincts if they can accomplish more in other 
precincts. 

The following table gives the estimated increase in the number of votes for the party’s 
candidate in each precinct if it were allocated various numbers of workers: 


Number Precinct 
of Workers 2 3 4 
0 0 0 0 
1 7 5 6 
2 11 10 11
3 16 15 14 
4 18 18 16 
5 20 21 17 
6 21 22 18 





This problem has several optimal solutions for how many of the six workers should be assigned 
to each of the four precincts to maximize the total estimated increase in the plurality of the 
party’s candidate. Use dynamic programming to find all of them so the chairwoman can make 
the final selection based on other factors. 


8. Use dynamic programming to solve the Northern Airplane Co. production scheduling 
problem presented in Sec. 7.1 (see Table 7.7). Assume that production quantities must be 
integer multiples of 5. 


9. Reconsider the Build-Em-Fast Co. problem described in Prob. 21 of Chap. 7 (also 
see Prob. 51 of Chap. 14). Use dynamic programming to solve this problem. 


10.* A company will soon be introducing a new product into a very competitive market
and is currently planning its marketing strategy. The decision has been made to introduce the
product in three phases. Phase 1 will feature making a special introductory offer of the product
to the public at a greatly reduced price to attract first-time buyers. Phase 2 will involve an
intensive advertising campaign to persuade these first-time buyers to continue purchasing the
product at a regular price. It is known that another company will be introducing a new
competitive product at about the time phase 2 will end. Therefore, phase 3 will involve a
follow-up advertising and promotion campaign to try to keep the regular purchasers from
switching to the competitive product.

A total of $4 million has been budgeted for this marketing campaign. The problem now
is to determine how to allocate this money most effectively to the three phases. Let m denote
the initial share of the market (expressed as a percentage) attained in phase 1, \( f_2 \) the fraction
of this market share that is retained in phase 2, and \( f_3 \) the fraction of the remaining market
share that is retained in phase 3. Given the following data, use dynamic programming to
determine how to allocate the $4 million to maximize the final share of the market for the new
product, i.e., to maximize \( m f_2 f_3 \).

(a) Assume that the money must be spent in integer multiples of $1 million in each
phase, where the minimum permissible multiple is 1 for phase 1 and 0 for phases
2 and 3. The following table gives the estimated effect of expenditures in each
phase:

Millions              Effect on
of Dollars           Market Share
Expended         m      \( f_2 \)    \( f_3 \)
    0            -      0.2          0.3
    1           20      0.4          0.5
    2           30      0.5          0.6
    3           40      0.6          0.7
    4           50      -            -





(b) Now assume that any amount within the total budget can be spent in each phase,
where the estimated effect of spending an amount \( x_i \) (in units of millions of dollars)
in phase i (i = 1, 2, 3) is

    \( m = 10x_1 - x_1^2 \)
    \( f_2 = 0.40 + 0.10x_2 \)
    \( f_3 = 0.60 + 0.07x_3 \).

[Hint: After solving for the \( f_2^*(s) \) and \( f_3^*(s) \) functions analytically, solve for
\( x_1^* \) graphically.]


11. Consider an electronic system consisting of four components, each of which must 
function for the system to function. The reliability of the system can be improved by installing 
several parallel units in one or more of the components. The following table gives the probability 
that the respective components will function if they consist of one, two, or three parallel units: 





Number of                     Probability of Functioning
Parallel Units    Component 1    Component 2    Component 3    Component 4
      1               0.5            0.6            0.7            0.5
      2               0.6            0.7            0.8            0.7
      3               0.8            0.8            0.9            0.9





The probability that the system will function is the product of the probabilities that the
respective components will function.

The cost (in hundreds of dollars) of installing one, two, or three parallel units in the
respective components is given by the following table:


Number of                              Cost
Parallel Units    Component 1    Component 2    Component 3    Component 4
      1                1              2              1              2
      2                2              4              3              3
      3                3              5              4              4





Because of budget limitations, a maximum of $1,000 can be expended. 
Use dynamic programming to determine how many parallel units should be installed in 
each of the four components to maximize the probability that the system will function. 


12. Consider the following integer nonlinear programming problem.

    Maximize      \( Z = x_1 x_2^2 x_3^3 \),
    subject to    \( x_1 + 2x_2 + 3x_3 \le 10 \),
                  \( x_1 \ge 1, \quad x_2 \ge 1, \quad x_3 \ge 1 \),
    and           \( x_1, x_2, x_3 \) are integers.

Use dynamic programming to solve this problem.

13.* Consider the following nonlinear programming problem.

    Maximize      \( Z = 36x_1 + 9x_1^2 - 6x_1^3 + 36x_2 - 3x_2^3 \),
    subject to    \( x_1 + x_2 \le 3 \)
    and           \( x_1 \ge 0, \quad x_2 \ge 0 \).

Use dynamic programming to solve this problem.

14. Resolve the Local Job Shop employment scheduling problem (Example 4) when the 
total cost of changing the level of employment from one season to the next is changed to $100 
times the square of the difference in employment levels. 

15. Consider the following nonlinear programming problem. 

Maximize Z = x_1^2 x_2,
subject to x_1^2 + x_2 ≤ 2.
(Note that there are no nonnegativity constraints.) Use dynamic programming to solve this 
problem. 

16. Consider the following nonlinear programming problem. 

Maximize Z = x_1^3 + 4x_2^2 + 16x_3,
subject to x_1 x_2 x_3 = 4
and x_1 ≥ 1, x_2 ≥ 1, x_3 ≥ 1.


(a) Solve by dynamic programming when, in addition to the given constraints, all three 
variables also are required to be integer. 
(b) Use dynamic programming to solve the problem as given (continuous variables). 


17. Consider the following nonlinear programming problem. 
Maximize Z = x_1(1 - x_2)x_3,
subject to x_1 - x_2 + x_3 ≤ 1
and x_1 ≥ 0, x_2 ≥ 0, x_3 ≥ 0.
Use dynamic programming to solve this problem. 
18. Consider the following linear programming problem. 
Maximize Z = 15x_1 + 10x_2,
subject to x_1 + 2x_2 ≤ 6
           3x_1 + x_2 ≤ 8
and x_1 ≥ 0, x_2 ≥ 0.
Use dynamic programming to solve this problem. 
19. Consider the following nonlinear programming problem. 
Maximize Z = 5x_1 + x_2,
subject to 2x_1^2 + x_2 ≤ 13
           x_1^2 + x_2 ≤ 9
and x_1 ≥ 0, x_2 ≥ 0.
Use dynamic programming to solve this problem. 
20. Consider the following “fixed-charge” problem.


Maximize Z = 3x_1 + 7x_2 + 6f(x_3),


subject to x_1 + 3x_2 + 2x_3 ≤ 6
           x_1 + x_2 ≤ 5
and x_1 ≥ 0, x_2 ≥ 0, x_3 ≥ 0,

where f(x_3) = 0,          if x_3 = 0
      f(x_3) = -1 + x_3,   if x_3 > 0.


Use dynamic programming to solve this problem. 


21. Consider the Food and Agriculture Organization example presented in Sec. 8.3.
Suppose that large additional amounts of equipment and money become available, so that the
constraint on experts, x_1 + 2x_2 ≤ 10, is the only relevant functional constraint. After deleting
the other two functional constraints, use dynamic programming to solve directly the first model
presented for this example (ignore the equivalent linear programming model) under the
following alternative assumptions.

(a) Assume that x_1 and x_2 are required to be integer.

(b) Assume that the divisibility assumption (see Sec. 3.3) holds, so that the only
restrictions on x_1 and x_2 are the one functional constraint and the nonnegativity
constraints.


22. A backgammon player will be playing three consecutive matches with friends
tonight. For each match, he will have the opportunity to place an even bet that he will win; the
amount bet can be any quantity of his choice between zero and the amount of money he still
has left after the bets on the preceding matches. For each match, the probability is 1/2 that he
will win the match and thus win the amount bet, whereas the probability is 1/2 that he will lose
the match and thus lose the amount bet. He will begin with $75, and his goal is to have $100
at the end. (Because these are friendly matches, he does not want to end up with more than
$100.) Therefore, he wants to find the optimal betting policy (including all ties) that maximizes
the probability that he will have exactly $100 after the three matches.

Use dynamic programming to solve this problem. 


23. Imagine that you have $5,000 to invest and that you will have an opportunity to 
invest that amount in either of two investments (A or B) at the beginning of each of the next 3
years. Both investments have uncertain returns. For investment A you will either lose your 
money entirely or (with higher probability) get back $10,000 (a profit of $5,000) at the end of 
the year. For investment B you will either get back just your $5,000 or (with low probability) 
$10,000 at the end of the year. The probabilities for these events are: 


                Amount
Investment | Returned ($) | Probability
     A     |        0     |    0.3
           |   10,000     |    0.7
     B     |    5,000     |    0.9
           |   10,000     |    0.1

You are allowed to make only (at most) one investment each year, and can invest only $5,000 
each time. (Any additional money accumulated is left idle.) 
(a) Use dynamic programming to find the investment policy that maximizes the expected 
amount of money you will have after the 3 years. 
(b) Use dynamic programming to find the investment policy that maximizes the prob- 
ability that you will have at least $10,000 after the 3 years. 


24.* Suppose that the situation for the Hit-and-Miss Manufacturing Co. problem
(Example 6) has changed somewhat. After a more careful analysis, you now estimate that each
item produced will be acceptable with probability 2/3, rather than 1/2, so that the probability of
producing zero acceptable items in a lot of size L is (1/3)^L. Furthermore, there now is only enough
time available to make two production runs. Use dynamic programming to determine the new 
optimal policy for this problem. 


25. Reconsider Example 7. Suppose that the bet is changed to: “Starting with two chips,
she will not have five chips after five plays of the game.” By referring to the previous
computational results, make additional calculations to determine what the new optimal policy is for
the enterprising young statistician.


26. The Profit & Gambit Co. has a major product that has been losing money recently 
because of declining sales. In fact, during the current quarter of the year, sales will be 4 million 
units below the break-even point. Because the marginal revenue for each unit sold exceeds the 
marginal cost by $5, this amounts to a loss of $20 million for the quarter. Therefore, manage- 
ment must take action quickly to rectify this situation. Two alternative courses of action are 
being considered. One is to abandon the product immediately, incurring a cost of $20 million 
for shutting down. The other alternative is to undertake an intensive advertising campaign to 
increase sales and then abandon the product (at the cost of $20 million) only if the campaign 
is not sufficiently successful. Tentative plans for this advertising campaign have been developed 
and analyzed. It would extend over the next three quarters (subject to early cancellation), and 
the cost would be $30 million in each of the three quarters. It is estimated that the increase in 
sales would be approximately 3 million units in the first quarter, another 2 million units in the 
second quarter, and another 1 million units in the third quarter. However, because of a number 
of unpredictable market variables, there is considerable uncertainty as to what impact the
advertising actually would have, and careful analysis indicates that the estimates for each quarter
could turn out to be off by as much as 2 million units in either direction. (To quantify this 
uncertainty, assume that the additional increases in sales in the three quarters are independent 
random variables having a uniform distribution with a range from 1 to 5 million, from 0 to 4 
million, and from -1 to 3 million, respectively.) If the actual increases are too small, the
advertising campaign can be discontinued and the product abandoned at the end of either of 
the next two quarters. 

If the intensive advertising campaign were to be initiated and continued to its completion, 
it is estimated that the sales for some time thereafter would continue to be at about the same 
level as in the third (last) quarter of the campaign. Therefore, if the sales in that quarter still 
are below the break-even point, the product would be abandoned. Otherwise, it is estimated 
that the expected discounted profit thereafter would be $40 for each unit sold over the break- 
even point in the third quarter. 

Use dynamic programming to determine the optimal policy maximizing expected profit.



Life is full of conflict and competition. Numerous examples involving adversaries in
conflict include parlor games, military battles, political campaigns, advertising and 
marketing campaigns by competing business firms, and so forth. A basic feature in 
many of these situations is that the final outcome depends primarily upon the com- 
bination of strategies selected by the adversaries. Game theory is a mathematical 
theory that deals with the general features of competitive situations like these in a 
formal, abstract way. It places particular emphasis on the decision-making processes 
of the adversaries. 

As briefly surveyed in Sec. 12.6, research on game theory continues to delve 
into rather complicated types of competitive situations. However, the focus in this 
chapter is on the simplest case, called two-person, zero-sum games. As the name 
implies, these games involve only two adversaries or players (who may be armies, 
teams, firms, and so on). They are called zero-sum games because one player wins 
whatever the other one loses, so that the sum of their net winnings is zero. 

After introducing the basic model for two-person, zero-sum games in Sec. 12.1,
the following four sections describe and illustrate different approaches to solving such
games. We then conclude the chapter by mentioning some other kinds of competitive 
situations that are dealt with by other branches of game theory. 


12.1 The Formulation of Two-Person, Zero-Sum Games 


To illustrate the basic characteristics of two-person, zero-sum games, consider the 
game called Odds and Evens. This game consists simply of each player simultaneously 
showing either one finger or two fingers. If the number of fingers matches, so that 
the total number for both players is even, then the player taking Evens (say, player 
I) wins the bet (say $1) from the player taking Odds (player II). If the number does 
not match, player I would pay $1 to player II. Thus each player has two strategies: 
to show either one finger or two fingers. The resulting payoff to player I in dollars is 
shown in the payoff table given in Table 12.1. 
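
As a small illustration of how such a payoff table follows from the rules of the game, the sketch below builds the table for Odds and Evens directly from the description above. The function and variable names are illustrative choices, not notation from the text, and the $1 bet is the one assumed in the example.

```python
# Payoff to player I (Evens) in the Odds and Evens game, built from the rule
# stated in the text: +$1 if the finger counts match (even total), -$1 otherwise.
FINGERS = (1, 2)  # the two pure strategies available to each player


def odds_and_evens_payoff(fingers_i, fingers_ii, bet=1):
    """Return player I's winnings in dollars for one play of the game."""
    total = fingers_i + fingers_ii
    return bet if total % 2 == 0 else -bet


# payoff_table[i][j]: player I shows FINGERS[i], player II shows FINGERS[j].
payoff_table = [
    [odds_and_evens_payoff(a, b) for b in FINGERS] for a in FINGERS
]

for shown, row in zip(FINGERS, payoff_table):
    print(shown, row)
```

Because the game is zero-sum, the table for player II is simply the negative of `payoff_table`, so only one table needs to be stored.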
In general, a two-person game is characterized by 


1. The strategies of player I. 
2. The strategies of player II. 
3. The payoff table. 


Before the game begins, each player knows the strategies he or she has available, the 
ones the opponent has available, and the payoff table. The actual play of the game 
consists of the players simultaneously choosing a strategy without knowing the op- 
ponent’s choice. 

A strategy may involve only a simple action, as when showing a certain number 
of fingers in the Odds and Evens game. On the other hand, in more complicated games 
involving a series of moves, a strategy is a predetermined rule that specifies com- 
pletely how one intends to respond to each possible circumstance at each stage of the 
game. For example, a strategy for one side in chess would indicate how to make the 
next move for every possible position on the board, so the total number of possible 
strategies would be astronomical. Applications of game theory normally involve far 
less complicated competitive situations than chess, but the strategies involved can be 
fairly complex. 

The payoff table shows the gain (positive or negative) for player I that would 
result from each combination of strategies for the two players. It is given only for 
player I because the table for player II is just the negative of this one, due to the zero- 
sum nature of the game. 

The entries in the payoff table may be in any units desired, such as dollars,
provided that they accurately represent the utility to player I of the corresponding
outcome. However, utility is not necessarily proportional to the amount of money (or
any other commodity) when large quantities are involved. For example, $2 million
(after taxes) is probably worth much less than twice as much as $1 million to a poor
person. In other words, given the choice between (1) a 50 percent chance of receiving
$2 million rather than nothing and (2) being sure of getting $1 million, such an
individual probably would much prefer the latter. On the other hand, the outcome
corresponding to an entry of 2 in a payoff table should be “worth twice as much” to
player I as the outcome corresponding to an entry of 1. Thus, given the choice, he
or she should be indifferent between a 50 percent chance of receiving the former
outcome (rather than nothing) and definitely receiving the latter outcome instead.¹


Table 12.1 Payoff Table for the Odds and Evens Game

                       II
      Strategy      1      2
          1         1     -1
   I
          2        -1      1

A primary objective of game theory is the development of rational criteria for 
selecting a strategy. Two key assumptions are made: 


1. Both players are rational. 
2. Both players choose their strategies solely to promote their own welfare (no 
compassion for the opponent). 


Game theory contrasts with decision analysis (see Chap. 22), where the
assumption is that the decision maker is playing a game with a passive opponent, nature,
which chooses its strategies in some random fashion.

We shall develop the standard game theory criteria for choosing strategies by 
means of illustrative examples. In particular, the next section presents a prototype 
example that illustrates the formulation of a two-person, zero-sum game and its so- 
lution in some simple situations. A more complicated variation of this game is then 
carried into Sec. 12.3 to develop a more general criterion. Sections 12.4 and 12.5 
describe a graphical procedure and a linear programming formulation for solving 
such games. 


12.2 Solving Simple Games—A Prototype Example 


Two politicians are running against each other for the U.S. Senate. Campaign plans 
must now be made for the final 2 days, which are expected to be crucial because of 
the closeness of the race. Therefore, both politicians want to spend these days cam- 
paigning in two key cities: Bigtown and Megalopolis. To avoid wasting campaign 
time, they plan to travel at night and spend either one full day in each city or two 
full days in just one of the cities. However, since the necessary arrangements must 
be made in advance, neither politician will learn his (or her)² opponent’s campaign
schedule until after he has finalized his own. Therefore each politician has asked his 
campaign manager in each of these cities to assess what the impact would be (in terms 
of votes won or lost) from the various possible combinations of days spent there by 
himself and by his opponent. He then wishes to use this information to choose his 
best strategy on how to use these 2 days. 


FORMULATION: To formulate this problem as a two-person, zero-sum game, we 
must identify the two players (obviously the two politicians), the strategies for each 
player, and the payoff table. 


¹ See Sec. 22.5 for a further discussion of the concept of utility.


² We use only his or her in our examples and problems for ease of reading; we do not mean to imply that
only men or women are engaged in the various activities.


As the problem has been stated, each player has the following three strategies: 


Strategy 1 = spend 1 day in each city. 
Strategy 2 = spend both days in Bigtown. 
Strategy 3 = spend both days in Megalopolis. 


By contrast, the strategies would have been more complicated ones in a different 
situation where each politician would learn where his opponent will spend his first 
day before he finalizes his own plans for his second day. In that case, a typical strategy 
would be: Spend the first day in Bigtown; if the opponent also spends the first day in 
Bigtown, then spend the second day in Bigtown; however, if the opponent spends the 
first day in Megalopolis, then spend the second day in Megalopolis. There would be 
eight such strategies, one for each combination of the two first-day choices, the 
opponent’s two first-day choices, and the two second-day choices. 
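
The count of eight can be checked by enumeration. The sketch below is only an illustration under one assumed encoding: a strategy is a first-day city together with a rule that maps each possible first-day city of the opponent to a second-day city. The names `CITIES` and `strategies` are hypothetical.

```python
from itertools import product

CITIES = ("Bigtown", "Megalopolis")

# A strategy = (own first-day city, rule), where the rule maps the opponent's
# observed first-day city to one's own second-day city.
strategies = []
for first_day in CITIES:
    for responses in product(CITIES, repeat=len(CITIES)):
        rule = dict(zip(CITIES, responses))
        strategies.append((first_day, rule))

print(len(strategies))  # 2 first-day choices x 2 x 2 response rules
for first_day, rule in strategies:
    print(first_day, rule)
```

The typical strategy quoted in the text corresponds to the entry with first day Bigtown and the rule that answers Bigtown with Bigtown and Megalopolis with Megalopolis.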

Each entry in the payoff table for player I represents the utility to player I (or 
the negative utility to player II) of the outcome resulting from the corresponding 
strategies used by the two players. From the politician’s viewpoint, the objective is 
to win votes, and each additional vote (before learning the outcome of the election) 
is of equal value to him. Therefore, the appropriate entries for the payoff table are 
the total net votes won from the opponent (i.e., the sum of the net vote changes in 
the two cities) resulting from these 2 days of campaigning. This formulation is sum- 
marized in Table 12.2. 

However, we should also point out that this payoff table would not be appro- 
priate if additional information were available to the politicians. In particular, if they 
knew exactly how the populace was planning to vote 2 days before the election, the 
only significance of the data prescribed by Table 12.2 would be to indicate which 
politician would win the election with each combination of strategies. Because the 
ultimate goal is to win the election, and because the size of the plurality is relatively 
inconsequential, the utility entries in the table then should be some positive constant 
(say, +1) when politician I would win and —1 when he would lose. Even if only a 
probability of winning can be determined for each combination of strategies, the 
appropriate entries would be the probability of winning minus the probability of losing 
because they then would represent expected utilities. However, sufficiently accurate 
data to make such determinations usually are not available. 

Using the form given in Table 12.2, three alternative sets of data for the payoff 
table are given here to illustrate how to solve three different kinds of games. 


Variation 1: Given that Table 12.3 is the payoff table for the two politicians (players), 
which strategy should each of them select? 


Table 12.2 Formulation of Payoff Table for the
Political Campaign Problem

                         Total Net Votes Won by Politician I
                         (in Units of 1,000 Votes)

                                  Politician II
              Strategy         1       2       3

                     1
     Politician I    2
                     3


Table 12.3 Payoff Table for Variation 1 of the Political
Campaign Problem

                       II
      Strategy      1      2      3
          1         1      2      4
   I      2         1      0      5
          3         0      1     -1

This situation is a rather special one, where the answer can be obtained just by 
applying the concept of dominated strategies to rule out a succession of inferior 
strategies until only one choice remains. Specifically, a strategy can be eliminated
from further consideration if it is dominated by another strategy, that is, if there is
another strategy that is always at least as good regardless of what the opponent does.

At the outset, Table 12.3 includes no dominated strategies for player II. However,
for player I, strategy 3 is dominated by strategy 1 because the latter has larger
payoffs (1 > 0, 2 > 1, 4 > -1) regardless of what player II does. Eliminating
strategy 3 from further consideration yields the following reduced payoff table:


             1      2      3
      1      1      2      4
      2      1      0      5


Because both players are assumed to be rational, player II also can deduce that
player I has only these two strategies remaining under consideration. Therefore, player
II now does have a dominated strategy: strategy 3, which is dominated by both
strategies 1 and 2 because they always have smaller losses (payoffs to player I)
in this reduced payoff table (for strategy 1: 1 < 4, 1 < 5; for strategy 2: 2 < 4,
0 < 5). Eliminating this strategy yields


             1      2
      1      1      2
      2      1      0


At this point, strategy 2 for player I becomes dominated by strategy 1 because
the latter is better in column 2 (2 > 0) and equally good in column 1 (1 = 1).
Eliminating the dominated strategy leads to


             1      2
      1      1      2


where strategy 2 for player II is dominated by strategy 1 (1 < 2). Consequently, both
players should select their strategy 1. Player I then will receive a payoff of 1 from
player II (i.e., politician I will gain 1,000 votes from politician II).

In general, the payoff to player I when both players play optimally is referred
to as the value of the game. A game that has a value of zero is said to be a fair
game. Since this particular game has a value of 1, it is not a fair game.


The concept of a dominated strategy is a very useful one for reducing the size 
of the payoff table that needs to be considered and, in unusual cases like this one, 
actually identifying the optimal solution for the game. However, most games require 
another approach to at least finish solving, as illustrated by the next two variations of 
the example. 
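
The successive elimination of dominated strategies used for variation 1 can be sketched in a few lines of code. This is a minimal illustration, not a procedure given in the text: the function name is hypothetical, dominance is taken in the "at least as good" sense stated above, and the payoff entries are those of variation 1 as implied by the comparisons and reduced tables in the discussion.

```python
# Payoffs to player I for variation 1 (rows: player I, columns: player II).
TABLE_12_3 = [
    [1, 2, 4],
    [1, 0, 5],
    [0, 1, -1],
]


def eliminate_dominated(payoff):
    """Repeatedly delete dominated rows (player I) and columns (player II).

    Returns the surviving strategies of each player, numbered from 1.
    """
    rows = list(range(len(payoff)))
    cols = list(range(len(payoff[0])))
    changed = True
    while changed:
        changed = False
        # Player I wants large payoffs: row r is dominated if some other
        # remaining row is at least as large in every remaining column.
        for r in rows:
            if any(o != r and all(payoff[o][c] >= payoff[r][c] for c in cols)
                   for o in rows):
                rows.remove(r)
                changed = True
                break
        # Player II wants small payoffs to player I: column c is dominated if
        # some other remaining column is at least as small in every row.
        for c in cols:
            if any(o != c and all(payoff[r][o] <= payoff[r][c] for r in rows)
                   for o in cols):
                cols.remove(c)
                changed = True
                break
    return [r + 1 for r in rows], [c + 1 for c in cols]


survivors_i, survivors_ii = eliminate_dominated(TABLE_12_3)
print(survivors_i, survivors_ii)
```

For this table the procedure leaves only strategy 1 for each player, in agreement with the value of 1 found above. For most games it stops with more than one strategy remaining, which is why another approach is needed.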


Variation 2: Now suppose that the current data give Table 12.4 as the payoff table 
for the politicians (players). This game does not have dominated strategies, so it is 
not obvious what the players should do. What line of reasoning does game theory say 
they should use? 

Consider player I. By selecting strategy 1, he could win 6 or he could lose as
much as 3. However, because player II is rational and thus will protect himself from
large payoffs to I, it seems probable that playing strategy 1 would result in a loss to
player I. Similarly, by selecting strategy 3, player I could win 5, but more probably
his rational opponent would avoid this loss and instead administer him a loss, which
could be as large as 4. On the other hand, if player I selects strategy 2, he is guaranteed
not to lose anything, and he could even win something. Therefore, because it provides
the best guarantee, strategy 2 seems to be a “rational” choice for player I against
his rational opponent.

Now consider player II. He could lose as much as 5 or 6 by using strategy 1 or 
3, but is guaranteed at least breaking even with strategy 2. Therefore, using the same 
reasoning of seeking the best guarantee against a rational opponent, his apparent 
choice is strategy 2. 

If both players choose their strategy 2, the result is that both break even. Thus, 
in this case, neither player improves upon his best guarantee, but both also are forcing 
the opponent into the same position. Even when the opponent deduces a player’s 
strategy, the opponent cannot exploit this information to improve his position. Stale- 
mate. 

The end product of this line of reasoning is that each player should play in such 
a way as to minimize his maximum losses whenever the resulting choice of strategy 
cannot be exploited by the opponent to then improve his position. This so-called 
minimax criterion is a standard criterion proposed by game theory for selecting a 
strategy. In terms of the payoff table, it implies that player I should select the strategy 
whose minimum payoff is largest, whereas player II should choose the one whose 
maximum payoff to player I is the smallest. This criterion is illustrated in Table 12.4,
where strategy 2 is identified as the maximin strategy for player I, and strategy 2
is the minimax strategy for player II. The resulting payoff of 0 is the value of the
game, so this is a fair game.


Table 12.4 Payoff Table for Variation 2 of the Political
Campaign Problem

                       II
      Strategy      1      2      3      Minimum
          1        -3     -2      6        -3
   I      2         2      0      2         0   <- Maximin value
          3         5     -2     -4        -4
      Maximum:      5      0      6
                           ^
                     Minimax value


Notice the interesting fact that the same entry in this payoff table yields both
the maximin and minimax values. The reason is that this entry is both the minimum 
in its row and the maximum in its column. The position of any such entry is called 
a saddle point. 
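
The row minima, column maxima, and saddle-point test for Table 12.4 can be reproduced mechanically. The following sketch is illustrative only; the function name and the 1-based numbering of the returned positions are choices made here, not notation from the text.

```python
# Payoffs to player I for variation 2 (rows: player I, columns: player II).
TABLE_12_4 = [
    [-3, -2, 6],
    [2, 0, 2],
    [5, -2, -4],
]


def pure_strategy_analysis(payoff):
    """Return (maximin value, minimax value, list of saddle-point positions)."""
    row_min = [min(row) for row in payoff]         # worst case for player I
    col_max = [max(col) for col in zip(*payoff)]   # worst case for player II
    maximin = max(row_min)
    minimax = min(col_max)
    # A saddle point is an entry that is both the minimum of its row and the
    # maximum of its column.
    saddle_points = [
        (i + 1, j + 1)
        for i, row in enumerate(payoff)
        for j, entry in enumerate(row)
        if entry == row_min[i] and entry == col_max[j]
    ]
    return maximin, minimax, saddle_points


print(pure_strategy_analysis(TABLE_12_4))
```

For Table 12.4 both values are 0 and the only saddle point is at row 2, column 2. An empty list of saddle points signals a game in which the maximin and minimax values differ.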

The fact that this game possesses a saddle point was actually crucial in deter- 
mining how it should be played. Because of the saddle point, neither player can take 
advantage of the opponent’s strategy to improve his own position. In particular, when 
player II predicts or learns that player I is using strategy 2, player II would only 
increase his losses if he were to change from his original plan of using his strategy 
2. Similarly, player I would only worsen his position if he were to change his plan. 
Thus neither player has any motive to consider changing strategies, either to take 
advantage of his opponent or to prevent the opponent from taking advantage of him. 
Therefore, since this is a stable solution, players I and II should exclusively use their 
maximin and minimax strategies, respectively. 

As the next variation illustrates, some games do not possess a saddle point, in 
which case a more complicated analysis is required. 


Variation 3: Late developments in the campaign result in the final payoff table for 
the two politicians (players) given by Table 12.5. How should this game be played? 

Suppose that both players attempt to apply the minimax criterion in the same
way as in variation 2. Player I can guarantee that he will lose no more than 2 by
playing strategy 1. Similarly, player II can guarantee that he will lose no more than
2 by playing strategy 3.

However, notice that the maximin value (-2) and the minimax value (2) do
not coincide in this case. The result is that there is no saddle point.

What are the resulting consequences if both players plan to use the strategies 
just derived? It can be seen that player I would win 2 from player II, which would 
make player II unhappy. Because player II is rational and can therefore foresee this 
outcome, he would then conclude that he can do much better, actually winning 2 
rather than losing 2, by playing strategy 2 instead. Because player I is also rational, 
he would anticipate this switch and conclude that he can improve considerably, from 
-2 to 4, by changing to strategy 2. Realizing this, player II would then consider
switching back to strategy 3 to convert a loss of 4 to a gain of 3. This possibility of 
a switch would cause player I to consider again using strategy 1, after which the 
whole cycle would start over again. 

In short, the originally suggested solution (player I to play strategy 1 and player 
II to play strategy 3) is an unstable solution, so it is necessary to develop a more 
satisfactory solution. But what kind of solution should it be? 
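
The cycle of switches described above can be traced mechanically. The sketch below assumes, purely for illustration, that the players take turns moving to a best reply against the other's current pure strategy, with player II moving first; the function names are hypothetical.

```python
# Payoffs to player I for variation 3 (rows: player I, columns: player II).
TABLE_12_5 = [
    [0, -2, 2],
    [5, 4, -3],
    [2, 3, -4],
]


def best_reply_ii(payoff, i):
    """Column that minimizes the payoff to player I against row i."""
    row = payoff[i]
    return row.index(min(row))


def best_reply_i(payoff, j):
    """Row that maximizes the payoff to player I against column j."""
    column = [row[j] for row in payoff]
    return column.index(max(column))


def trace_switches(payoff, i, j, steps):
    """Alternate best replies (player II first); return 1-based strategy pairs."""
    history = [(i + 1, j + 1)]
    for step in range(steps):
        if step % 2 == 0:
            j = best_reply_ii(payoff, i)
        else:
            i = best_reply_i(payoff, j)
        history.append((i + 1, j + 1))
    return history


# Start from the pure-strategy suggestion: player I plays 1, player II plays 3.
print(trace_switches(TABLE_12_5, 0, 2, 4))
# [(1, 3), (1, 2), (2, 2), (2, 3), (1, 3)]
```

After four switches the players are back at the starting pair, which is the instability that motivates the randomized approach developed next.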


Table 12.5 Payoff Table for Variation 3 of the Political 
Campaign Problem 





1 2 3 Minimum 





1 0 od 2 —2 <— Maximin value 
I 2 5 4 -3 -3 
3 2 3 -4 —4 
Maximum: 5 4 2 
1 


Minimax value 


The key fact seems to be that whenever one player’s strategy is predictable, the 
opponent here can take great advantage of this information to improve his position. 
Therefore, an essential feature of a rational plan for playing a game such as this one 
is that neither player should be able to deduce which strategy the other will use. 
Hence, rather than applying some known criterion for determining a single strategy 
that will definitely be used, it is necessary to choose among alternative acceptable 
strategies on some kind of random basis. By doing this, neither player knows 
in advance which of his own strategies will be used, let alone what his opponent 
will do. 

This suggests, in very general terms, the kind of approach that is required for 
games lacking a saddle point. The next section discusses this approach more fully. 
Given this foundation, we turn our attention to procedures for finding an optimal way 
of playing such games. This particular variation of the political campaign problem 
will continue to be used to illustrate these ideas as they are developed. 


12.3 Games with Mixed Strategies 


Whenever a game does not possess a saddle point, game theory advises each player 
to assign a probability distribution over his or her set of strategies. To express this 
mathematically, let 


    x_i = probability that player I will use strategy i (i = 1, 2, ..., m),

    y_j = probability that player II will use strategy j (j = 1, 2, ..., n),


where m and n are the respective numbers of available strategies. Thus player I would
specify his plan for playing the game by assigning values to x_1, x_2, ..., x_m. Because
these values are probabilities, they would need to be nonnegative and add up to 1.
Similarly, the plan for player II would be described by the values he assigns to
his decision variables y_1, y_2, ..., y_n. These plans (x_1, x_2, ..., x_m) and
(y_1, y_2, ..., y_n) are usually referred to as mixed strategies, and the original
strategies would then be called pure strategies.

When actually playing the game, it is necessary for each player to use one of 
his pure strategies. However, this pure strategy would be chosen by using some 
random device to obtain a random observation from the probability distribution speci- 
fied by the mixed strategy, where this observation would indicate which particular 
pure strategy to use. 

To illustrate, suppose that players I and II in variation 3 of the political campaign
problem (see Table 12.5) select the mixed strategies (x_1, x_2, x_3) = (1/2, 1/2, 0) and
(y_1, y_2, y_3) = (0, 1/2, 1/2), respectively. This selection would say that player I is giving
an equal chance (probability of 1/2) to choosing either (pure) strategy 1 or 2, but he is
discarding strategy 3 entirely. Similarly, player II is randomly choosing between his
last two pure strategies. To play the game, each player could then flip a coin to
determine which of his two acceptable pure strategies he will actually use.
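
In place of a coin, the random device could be a pseudorandom number generator. The sketch below is one possible illustration using Python's standard `random.choices`; the function name `draw_pure_strategy` is hypothetical, and strategies are numbered from 1 as in the text.

```python
import random


def draw_pure_strategy(mixed_strategy, rng=random):
    """Pick one pure strategy (numbered from 1) at random, using the
    probabilities of the mixed strategy as weights."""
    strategies = range(1, len(mixed_strategy) + 1)
    return rng.choices(strategies, weights=mixed_strategy, k=1)[0]


x = (0.5, 0.5, 0.0)  # player I's mixed strategy from the example
y = (0.0, 0.5, 0.5)  # player II's mixed strategy from the example

rng = random.Random(12)  # seeded only so that a run can be repeated
print(draw_pure_strategy(x, rng), draw_pure_strategy(y, rng))
```

A pure strategy with probability 0, such as strategy 3 for player I here, is never drawn.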

Although no completely satisfactory measure of performance is available for 
evaluating mixed strategies, a very useful one is the expected payoff. Applying the 
probability theory definition of expected value, this quantity is 


    Expected payoff = \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij} x_i y_j,


where p_{ij} is the payoff if player I uses pure strategy i and player II uses pure strategy
j. In the example of mixed strategies just given, there are four possible payoffs (-2,
2, 4, -3), each occurring with a probability of 1/4, so the expected payoff is (1/4)(-2 +
2 + 4 - 3) = 1/4. Thus this measure of performance does not disclose anything about
the risks involved in playing the game, but it does indicate what the average payoff
will tend to be if the game is played many times.
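
The double sum can be evaluated directly. The sketch below recomputes the expected payoff for the example, using exact fractions so that the result can be compared with the hand calculation; the function name is an illustrative choice.

```python
from fractions import Fraction

# Payoffs to player I for variation 3 (Table 12.5).
TABLE_12_5 = [
    [0, -2, 2],
    [5, 4, -3],
    [2, 3, -4],
]


def expected_payoff(payoff, x, y):
    """Sum of p_ij * x_i * y_j over all pairs of pure strategies."""
    return sum(
        payoff[i][j] * x[i] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


half = Fraction(1, 2)
x = (half, half, 0)   # player I mixes strategies 1 and 2
y = (0, half, half)   # player II mixes strategies 2 and 3

print(expected_payoff(TABLE_12_5, x, y))  # 1/4
```

Only the four entries in rows 1 and 2 and columns 2 and 3 receive positive probability, which is why the hand calculation involves just the payoffs -2, 2, 4, and -3.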

Using this measure, game theory extends the concept of the minimax criterion 
to games that lack a saddle point and thus need mixed strategies. In this context, the 
minimax criterion says that a given player should select the mixed strategy that 
minimizes the maximum expected loss to himself. Equivalently, when focusing on 
payoffs (player I) rather than losses (player II), this criterion says to maximin instead, 
i.e., maximize the minimum expected payoff to the player. By the minimum expected 
payoff we mean the smallest possible expected payoff that can result from any mixed 
strategy with which the opponent can counter. Thus the mixed strategy for player I 
that is optimal according to this criterion is the one that provides the guarantee
(minimum expected payoff) that is best (maximal). (The value of this best guarantee
is the maximin value, denoted by \underline{v}.) Similarly, the optimal strategy for player II is
the one that provides the best guarantee, where best now means minimal and guarantee
refers to the maximum expected loss that can be administered by any of the opponent’s
mixed strategies. (This best guarantee is the minimax value, denoted by \bar{v}.)

Recall that when only pure strategies were used, games not having a saddle
point turned out to be unstable (no stable solutions). The reason was essentially that
\underline{v} < \bar{v}, so that the players would want to change their strategies to improve their
positions. Similarly, for games with mixed strategies, it is necessary that \underline{v} = \bar{v} for
the optimal solution to be stable. Fortunately, according to the minimax theorem of
game theory, this condition always holds for such games.


Minimax theorem: If mixed strategies are allowed, the pair of mixed strategies
that is optimal according to the minimax criterion provides a stable solution
with \underline{v} = \bar{v} = v (the value of the game), so that neither player can do better
by unilaterally changing his strategy.


One proof of this theorem is included in Sec. 12.5. 
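
The theorem can be illustrated numerically for variation 3. The sketch below does not derive optimal strategies; it only checks a candidate pair, (7/11, 4/11, 0) for player I and (0, 5/11, 6/11) for player II, which is supplied here as an assumption rather than taken from this section. It relies on the fact that the expected payoff is linear in the opponent's probabilities, so each guarantee can be computed against the opponent's pure strategies alone. The function names are hypothetical.

```python
from fractions import Fraction

# Payoffs to player I for variation 3 (Table 12.5).
TABLE_12_5 = [
    [0, -2, 2],
    [5, 4, -3],
    [2, 3, -4],
]


def guarantee_for_i(payoff, x):
    """Minimum expected payoff to player I when he uses mixed strategy x."""
    return min(
        sum(x[i] * payoff[i][j] for i in range(len(x)))
        for j in range(len(payoff[0]))
    )


def guarantee_for_ii(payoff, y):
    """Maximum expected loss to player II when he uses mixed strategy y."""
    return max(
        sum(payoff[i][j] * y[j] for j in range(len(y)))
        for i in range(len(payoff))
    )


# Pure maximin and minimax strategies: the guarantees differ (-2 versus 2).
print(guarantee_for_i(TABLE_12_5, (1, 0, 0)), guarantee_for_ii(TABLE_12_5, (0, 0, 1)))

# Candidate mixed strategies (an assumption to be checked, not derived here).
x = (Fraction(7, 11), Fraction(4, 11), 0)
y = (0, Fraction(5, 11), Fraction(6, 11))
print(guarantee_for_i(TABLE_12_5, x), guarantee_for_ii(TABLE_12_5, y))
```

Since no guarantee for player I can exceed any guarantee for player II, finding a pair whose two guarantees coincide shows that both strategies are optimal and that the common number is the value of the game.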

Although the concept of mixed strategies becomes quite intuitive if the game is 
played repeatedly, it requires some interpretation when the game is to be played just 
once. In this case, using a mixed strategy still involves selecting and using one pure 
strategy (randomly selected from the specified probability distribution), so it might 
seem more sensible to ignore this randomization process and just choose the one 
“best” pure strategy to be used. However, we have already illustrated for variation 
3 in the preceding section that a player must not allow his opponent to deduce what 
his strategy will be (i.e., the solution procedure under the rules of game theory must 
not definitely identify which pure strategy will be used when the game is unstable). 
Furthermore, even if the opponent is able to use only his knowledge of the tendencies 
of the first player to deduce probabilities (for the pure strategy chosen) that are 
different from those for the optimal mixed strategy, then he still can take advantage 
of this knowledge to reduce the expected payoff to the first player. Therefore, the 
only way to guarantee attaining the optimal expected payoff $v$ is to randomly select 
the pure strategy to be used from the probability distribution for the optimal mixed 
strategy. (Valid statistical procedures for making such a random selection are discussed 
in Sec. 23.2.) 
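
The random selection just described can be sketched in a few lines of Python. This is only an illustration that relies on the standard library's pseudorandom generator; the function name `choose_pure_strategy` and the probabilities shown are hypothetical and do not come from the text.

```python
import random

def choose_pure_strategy(probabilities, rng=random):
    """Return the 1-based index of a pure strategy drawn from a mixed strategy."""
    strategies = range(1, len(probabilities) + 1)
    return rng.choices(strategies, weights=probabilities, k=1)[0]

# Hypothetical mixed strategy over three pure strategies.
mixed = [0.5, 0.3, 0.2]
rng = random.Random(0)  # fixed seed only so that the sketch is repeatable
draws = [choose_pure_strategy(mixed, rng) for _ in range(10_000)]

# Observed relative frequencies should be close to the specified probabilities.
print([draws.count(s) / len(draws) for s in (1, 2, 3)])
```

In a single play of the game only one such draw would be made; the repeated draws here serve only to show that the selection follows the intended distribution.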

Now we must show how to find the optimal mixed strategy for each player. 
There are several methods of doing this. One is a graphical procedure that may be 
used whenever one of the players has only two (undominated) pure strategies; this 
approach is described in the next section. When larger games are involved, the usual 
method is to transform the problem into a linear programming problem that would 
then be solved by the simplex method on a computer; Sec. 12.5 discusses this 
approach. 


12.4 Graphical Solution Procedure 


Consider any game with mixed strategies such that, after eliminating dominated strategies, 
one of the players has only two pure strategies. To be specific, let this player 
be player I. Because his mixed strategies are $(x_1, x_2)$ and $x_2 = 1 - x_1$, it is necessary 
for him to solve only for the optimal value of $x_1$. However, it is straightforward to 
plot the expected payoff as a function of $x_1$ for each of his opponent’s pure strategies. 
This graph can then be used to identify the point that maximizes the minimum expected 
payoff. The opponent’s minimax mixed strategy can also be identified from the graph. 

To illustrate this procedure, consider variation 3 of the political campaign prob- 
lem (see Table 12.5). Notice that the third pure strategy for player I is dominated by 
his second, so the payoff table can be reduced to the form given in Table 12.6. 
Therefore, for each of the pure strategies available to player II, the expected payoff 
for player I will be 


$(y_1, y_2, y_3)$      Expected Payoff 

(1, 0, 0)          $0x_1 + 5(1 - x_1) = 5 - 5x_1$ 
(0, 1, 0)          $-2x_1 + 4(1 - x_1) = 4 - 6x_1$ 
(0, 0, 1)          $2x_1 - 3(1 - x_1) = -3 + 5x_1$ 





Now plot these expected payoff lines on a graph, as shown in Fig. 12.1. For 
any given value of $x_1$ and of $(y_1, y_2, y_3)$, the expected payoff will be the appropriate 
weighted average of the corresponding points on these three lines. In particular, 


$$\text{Expected payoff} = y_1(5 - 5x_1) + y_2(4 - 6x_1) + y_3(-3 + 5x_1).$$
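
The weighted-average formula can be evaluated directly. The Python sketch below uses exact fractions; the trial values of $x_1$ and $(y_1, y_2, y_3)$ are hypothetical and were chosen only for illustration.

```python
from fractions import Fraction

# Expected payoff lines for player II's three pure strategies, as functions of x1.
lines = [
    lambda x1: 5 - 5 * x1,
    lambda x1: 4 - 6 * x1,
    lambda x1: -3 + 5 * x1,
]

def expected_payoff(x1, y):
    """Weighted average of the three lines at x1, with weights y = (y1, y2, y3)."""
    return sum(y_j * line(x1) for y_j, line in zip(y, lines))

# Hypothetical trial values, not taken from the text.
x1 = Fraction(1, 2)
y = (Fraction(1, 4), Fraction(1, 4), Fraction(1, 2))
print(expected_payoff(x1, y))  # 5/8
```

At $x_1 = 1/2$ the three lines take the values 5/2, 1, and -1/2, so any weighted average of them, such as 5/8 here, cannot fall below the lowest line.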


Table 12.6 Reduced Payoff Table for Variation 3 
of the Political Campaign Problem 

                                          II 
                     Probability      $y_1$    $y_2$    $y_3$ 
     Probability     Pure strategy      1        2        3 

     $x_1$                 1            0       -2        2 
I    $1 - x_1$             2            5        4       -3 

Figure 12.1 Graphical procedure for solving games. (Plot of expected payoff against $x_1$, with the maximin point marked.) 


Thus, given $x_1$, the minimum expected payoff is given by the corresponding point on 
the “bottom” line. According to the minimax (or maximin) criterion, player I should 
select the value of $x_1$ giving the largest minimum expected payoff, so that 

$$\underline{v} = v = \max_{0 \le x_1 \le 1} \{\min\{-3 + 5x_1,\ 4 - 6x_1\}\}.$$

Therefore, the optimal value of $x_1$ is the one at the intersection of the two lines 
$(-3 + 5x_1)$ and $(4 - 6x_1)$. Solving algebraically, 

$$-3 + 5x_1 = 4 - 6x_1,$$

so that $x_1 = \frac{7}{11}$; thus $(x_1, x_2) = (\frac{7}{11}, \frac{4}{11})$ is the optimal mixed strategy for player I, 
and 

$$\underline{v} = v = -3 + 5\left(\frac{7}{11}\right) = \frac{2}{11}$$

is the value of the game. 
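
The graphical search for the maximin point can be mimicked numerically. The sketch below is one possible way to do it, not the procedure of the text itself: it checks the two endpoints of the interval and every crossing point of two lines, because the lower envelope of straight lines can only peak at such points. All names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

# Each line is (intercept, slope): expected payoff = intercept + slope * x1.
lines = [(5, -5), (4, -6), (-3, 5)]

def lower_envelope(x1):
    """Minimum expected payoff over player II's pure strategies at x1."""
    return min(a + b * x1 for a, b in lines)

# The maximum of a concave piecewise-linear function on [0, 1] occurs at an
# endpoint or where two lines cross, so only those points need checking.
candidates = [Fraction(0), Fraction(1)]
for (a1, b1), (a2, b2) in combinations(lines, 2):
    if b1 != b2:
        x = Fraction(a2 - a1, b1 - b2)
        if 0 <= x <= 1:
            candidates.append(x)

best_x1 = max(candidates, key=lower_envelope)
print(best_x1, 1 - best_x1, lower_envelope(best_x1))  # 7/11 4/11 2/11
```

The result agrees with the algebraic solution: the optimal mixed strategy for player I is (7/11, 4/11), and the value of the game is 2/11.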

To find the corresponding optimal mixed strategy for player II, we now reason 
as follows. According to the definition of the minimax value $\bar{v}$ and the minimax 
theorem, the expected payoff resulting from the optimal strategy $(y_1, y_2, y_3) = 
(y_1^*, y_2^*, y_3^*)$ will satisfy the condition 


$$y_1^*(5 - 5x_1) + y_2^*(4 - 6x_1) + y_3^*(-3 + 5x_1) \le \bar{v} = v = \frac{2}{11}$$


for all values of $x_1$ $(0 \le x_1 \le 1)$. Furthermore, when player I is playing optimally 
(that is, $x_1 = \frac{7}{11}$), this inequality will be an equality, so that 


$$\frac{20}{11} y_1^* + \frac{2}{11} y_2^* + \frac{2}{11} y_3^* = v = \frac{2}{11}.$$

Because $(y_1, y_2, y_3)$ is a probability distribution, it is also known that 


$$y_1^* + y_2^* + y_3^* = 1.$$


Therefore, $y_1^* = 0$ because $y_1^* > 0$ would violate the next-to-last equation; i.e., the 
expected payoff on the graph at $x_1 = \frac{7}{11}$ would be above the maximin point. (In 
general, any line that does not pass through the maximin point must be given a zero 
weight to avoid increasing the expected payoff above this point.) 


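Hence 

$$y_2^*(4 - 6x_1) + y_3^*(-3 + 5x_1) \begin{cases} \le \frac{2}{11}, & \text{for } 0 \le x_1 \le 1, \\ = \frac{2}{11}, & \text{for } x_1 = \frac{7}{11}. \end{cases}$$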


But $y_2^*$ and $y_3^*$ are numbers, so the left-hand side is the equation of a straight line, 
which is a fixed weighted average of the two “bottom” lines on the graph. Because 
the ordinate of this line must equal $\frac{2}{11}$ at $x_1 = \frac{7}{11}$, and because it must never exceed 
$\frac{2}{11}$, the line necessarily is horizontal. (This conclusion is always true unless the optimal 
value of $x_1$ is either zero or 1, in which case player II also should use a single pure 
strategy.) Therefore, 


$$y_2^*(4 - 6x_1) + y_3^*(-3 + 5x_1) = \frac{2}{11}, \quad \text{for } 0 \le x_1 \le 1.$$


Hence, to solve for $y_2^*$ and $y_3^*$, select two values of $x_1$ (say, zero and 1), and solve 
the resulting two simultaneous equations. Thus 


$$4y_2^* - 3y_3^* = \frac{2}{11},$$

$$-2y_2^* + 2y_3^* = \frac{2}{11},$$

so that $y_2^* = \frac{5}{11}$ and $y_3^* = \frac{6}{11}$. Therefore, the optimal mixed strategy for player II is 
$(y_1, y_2, y_3) = (0, \frac{5}{11}, \frac{6}{11})$. 
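
The two simultaneous equations can be checked with exact arithmetic. The helper `solve_2x2` below is a hypothetical name introduced for this sketch; it applies Cramer's rule, which is just one convenient way to solve a pair of linear equations.

```python
from fractions import Fraction

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*u + a12*w = b1 and a21*u + a22*w = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the two equations are not independent")
    u = Fraction(b1 * a22 - a12 * b2, det)
    w = Fraction(a11 * b2 - b1 * a21, det)
    return u, w

v = Fraction(2, 11)
# x1 = 0 gives 4*y2 - 3*y3 = v; x1 = 1 gives -2*y2 + 2*y3 = v.
y2, y3 = solve_2x2(4, -3, v, -2, 2, v)
print(y2, y3, y2 + y3)  # 5/11 6/11 1
```

The two weights sum to 1, as they must once the weight on the first pure strategy has been set to zero.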

If, in another problem, there should happen to be more than two lines passing 
through the maximin point, so that more than two of the $y_j^*$ values can be greater than zero, 
this condition would imply that there are many ties for the optimal mixed strategy for 
player II. One such strategy can then be identified by setting all but two of these $y_j^*$ values 
equal to zero and solving for the remaining two in the manner just described. For the 
remaining two, the associated lines must have positive slope in one case and negative 
slope in the other. 

Although this graphical procedure has been illustrated for only one particular 
problem, essentially the same reasoning can be used to solve any game with mixed 
strategies that has only two undominated pure strategies for one of the players. 


12.5 Solving by Linear Programming 


Any game with mixed strategies can be solved by transforming the problem into a 
linear programming problem. As you will see, this transformation requires little more 
than applying the minimax theorem and using the definitions of the maximin value $\underline{v}$ 
and minimax value $\bar{v}$. 


First, consider how to find the optimal mixed strategy for player I. As indicated 
in Sec. 12.3, 


$$\text{Expected payoff} = \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij} x_i y_j,$$

and the strategy $(x_1, x_2, \ldots, x_m)$ is optimal if 

$$\sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij} x_i y_j \ge \underline{v} = v$$


for all opposing strategies $(y_1, y_2, \ldots, y_n)$. Thus this inequality will need to hold, 
for example, for each of the pure strategies of player II, i.e., for each of the strategies 
$(y_1, y_2, \ldots, y_n)$ where one $y_j = 1$ and the rest equal zero. Substituting these values 
into the inequality yields 


$$\sum_{i=1}^{m} p_{ij} x_i \ge v, \quad \text{for } j = 1, 2, \ldots, n,$$


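so that the inequality implies this set of $n$ inequalities. Furthermore, this set of $n$ 
inequalities implies the original inequality (rewritten), 


$$\sum_{j=1}^{n} y_j \left( \sum_{i=1}^{m} p_{ij} x_i \right) \ge \sum_{j=1}^{n} y_j v = v,$$

since $\sum_{j=1}^{n} y_j = 1$. 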


Because the implication goes in both directions, it follows that imposing this set of $n$ 
linear inequalities is equivalent to requiring the original inequality to hold for all 
strategies $(y_1, y_2, \ldots, y_n)$. But these $n$ inequalities are legitimate linear programming 
constraints, as are the additional constraints 


$$x_1 + x_2 + \cdots + x_m = 1$$

$$x_i \ge 0, \quad \text{for } i = 1, 2, \ldots, m$$


that are required to ensure that the $x_i$ are probabilities. Therefore, any solution 
$(x_1, x_2, \ldots, x_m)$ that satisfies this entire set of linear programming constraints is the 
desired optimal mixed strategy. 

Consequently, the problem of finding an optimal mixed strategy has been reduced 
to finding a feasible solution for a linear programming problem, which can be 
done as described in Chap. 4. The two remaining difficulties are (1) $v$ is unknown 
and (2) the linear programming problem has no objective function. Fortunately, both 
these difficulties can be resolved at one stroke by replacing the unknown constant $v$ 
by the variable $x_{m+1}$ and then maximizing $x_{m+1}$, so that $x_{m+1}$ automatically will equal 
$v$ (by definition) at the optimal solution for the linear programming problem! 

To summarize, player I would find his optimal mixed strategy by using the 
simplex method to solve the linear programming problem: 


$$
\begin{aligned}
\text{Minimize} \quad & (-x_{m+1}), \\
\text{subject to} \quad & p_{11}x_1 + p_{21}x_2 + \cdots + p_{m1}x_m - x_{m+1} \ge 0 \\
& p_{12}x_1 + p_{22}x_2 + \cdots + p_{m2}x_m - x_{m+1} \ge 0 \\
& \qquad \vdots \\
& p_{1n}x_1 + p_{2n}x_2 + \cdots + p_{mn}x_m - x_{m+1} \ge 0 \\
& -(x_1 + x_2 + \cdots + x_m) = -1 \\
\text{and} \quad & x_i \ge 0, \quad \text{for } i = 1, 2, \ldots, m.
\end{aligned}
$$


(The objective function and equality constraint have been rewritten here in an equivalent 
way for later convenience.) Note that $x_{m+1}$ is not restricted to be nonnegative, 
whereas the simplex method can be applied only after all the variables have nonnegativity 
constraints. However, this matter can be easily rectified, as will be discussed 
shortly. 

Now consider player II. He could find his optimal mixed strategy by rewriting 
the payoff table as the payoff to himself rather than to player I and then by proceeding 
exactly as just described. However, it is enlightening to summarize his formulation 
in terms of the original payoff table. By proceeding in a way that is completely 
analogous to the one just described, player II would conclude that his optimal mixed 
strategy is given by the optimal solution to the linear programming problem: 


$$
\begin{aligned}
\text{Maximize} \quad & (-y_{n+1}), \\
\text{subject to} \quad & p_{11}y_1 + p_{12}y_2 + \cdots + p_{1n}y_n - y_{n+1} \le 0 \\
& p_{21}y_1 + p_{22}y_2 + \cdots + p_{2n}y_n - y_{n+1} \le 0 \\
& \qquad \vdots \\
& p_{m1}y_1 + p_{m2}y_2 + \cdots + p_{mn}y_n - y_{n+1} \le 0 \\
& -(y_1 + y_2 + \cdots + y_n) = -1 \\
\text{and} \quad & y_j \ge 0, \quad \text{for } j = 1, 2, \ldots, n.
\end{aligned}
$$


Notice the key fact that this linear programming problem and the one given for 
player I are dual to each other in the sense described in Secs. 6.1 and 6.4. (In 
particular, this problem is in the form given for the primal problem, and the one for 
player I is the corresponding dual problem.) This fact has several important impli- 
cations. One implication is that the optimal mixed strategies for both players can be 
found by solving only one of the linear programming problems because the optimal 
dual solution is an automatic by-product of the simplex method calculations to find 
the optimal primal solution. A second implication is that this brings all duality theory 
(described in Chap. 6) to bear upon the interpretation and analysis of games. 

A related implication is that this provides a simple proof of the minimax theorem. 
Let $x_{m+1}^*$ and $y_{n+1}^*$ denote the values of $x_{m+1}$ and $y_{n+1}$ in the optimal solution of 
the respective linear programming problems. It is known from the strong duality 
property given in Sec. 6.1 that $-x_{m+1}^* = -y_{n+1}^*$, so that $x_{m+1}^* = y_{n+1}^*$. However, 
it is evident from the definition of $\underline{v}$ and $\bar{v}$ that $\underline{v} = x_{m+1}^*$ and $\bar{v} = y_{n+1}^*$, so it follows 
that $\underline{v} = \bar{v}$, as claimed by the minimax theorem. 

The objective functions of these two linear programming problems have been 
written as minimize $(-x_{m+1})$ and maximize $(-y_{n+1})$ simply to demonstrate that the 
two problems are dual to each other. Hereafter, these objective functions will be 
written in the more natural equivalent forms, maximize $x_{m+1}$ and minimize $y_{n+1}$. For 
the same reason, the negative sign now should be deleted from both sides of the 
equality constraint in each problem. 

One remaining loose end needs to be tied up, namely, what to do about $x_{m+1}$ 
and $y_{n+1}$ being unrestricted in sign in the linear programming formulations. If it is 
clear that $v \ge 0$ so that the optimal values of $x_{m+1}$ and $y_{n+1}$ are nonnegative, then it 
is safe to introduce nonnegativity constraints for these variables for purposes of applying 
the simplex method. However, if $v < 0$, then an adjustment needs to be made. 
One possibility is to use the approach described in Sec. 4.6 for replacing a variable 
without a nonnegativity constraint by the difference of two nonnegative variables. 
Another is to reverse players I and II so that the payoff table would be rewritten as 
the payoff to the original player II, which would make the corresponding value of $v$ 
positive. A third, and the most commonly used, procedure is to add a sufficiently 
large fixed constant to all the entries in the payoff table so that the new value of the 
game will be positive. (For example, setting this constant equal to the absolute value 
of the largest negative entry will suffice.) Because this same constant is added to 
every entry, this adjustment cannot alter the optimal mixed strategies in any way, so 
they can now be obtained in the usual manner. The indicated value of the game would 
be increased by the amount of the constant, but this value can be readjusted after the 
solution has been obtained. 
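
The claim that adding a constant leaves the optimal mixed strategies unchanged can be checked in one line. Writing $c$ for the added constant (a symbol introduced here only for illustration) and using the fact that the $x_i$ and the $y_j$ each sum to 1, the expected payoff under the adjusted table is

```latex
\sum_{i=1}^{m}\sum_{j=1}^{n} (p_{ij} + c)\, x_i y_j
  = \sum_{i=1}^{m}\sum_{j=1}^{n} p_{ij}\, x_i y_j
    + c \left(\sum_{i=1}^{m} x_i\right)\left(\sum_{j=1}^{n} y_j\right)
  = \sum_{i=1}^{m}\sum_{j=1}^{n} p_{ij}\, x_i y_j + c .
```

Every pair of mixed strategies therefore has its expected payoff raised by exactly $c$, so the comparisons that determine the optimal strategies are unaffected and only the value of the game shifts by $c$.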

To illustrate this linear programming approach, consider again variation 3 of 
the political campaign problem after eliminating dominated strategy 3 for player I (see 
Table 12.6). Because there are some negative entries in the reduced payoff table, it 
is unclear at the outset whether the value of the game $v$ is nonnegative (it turns out 
to be). For the moment, let us assume that $v \ge 0$ and proceed without making any 
of the adjustments discussed in the preceding paragraph. 

To write out the linear programming model for player I for this example, note 
that $p_{ij}$ in the general model is the entry in row $i$ and column $j$ of Table 12.6, for 
$i = 1, 2$ and $j = 1, 2, 3$. Using maximize $x_{m+1}$ instead of the equivalent minimize 
$(-x_{m+1})$, with $m = 2$ and $n = 3$, the resulting model is 


$$
\begin{aligned}
\text{Maximize} \quad & x_3, \\
\text{subject to} \quad & 5x_2 - x_3 \ge 0 \\
& -2x_1 + 4x_2 - x_3 \ge 0 \\
& 2x_1 - 3x_2 - x_3 \ge 0 \\
& x_1 + x_2 = 1 \\
\text{and} \quad & x_1 \ge 0, \quad x_2 \ge 0, \quad x_3 \ge 0.
\end{aligned}
$$


Applying the simplex method to this linear programming problem yields 
$x_1^* = \frac{7}{11}$, $x_2^* = \frac{4}{11}$, $x_3^* = \frac{2}{11}$ as the optimal solution. (See Probs. 16 and 17.) Consequently, 
the optimal mixed strategy for player I according to the minimax criterion 
is $(x_1, x_2) = (\frac{7}{11}, \frac{4}{11})$, and the value of the game is $v = x_3^* = \frac{2}{11}$. The simplex method 
also yields the optimal solution to the dual (given next) of this problem, namely, 
$y_1^* = 0$, $y_2^* = \frac{5}{11}$, $y_3^* = \frac{6}{11}$, $y_4^* = \frac{2}{11}$, so the optimal mixed strategy for player II is 
$(y_1, y_2, y_3) = (0, \frac{5}{11}, \frac{6}{11})$. 

The dual of the preceding problem is just the linear programming model for 
player II (the one with variables $y_1, y_2, \ldots, y_n, y_{n+1}$) shown earlier in this section. 
Plugging in the values of $p_{ij}$ from Table 12.6, this model (in minimization form) is 


$$
\begin{aligned}
\text{Minimize} \quad & y_4, \\
\text{subject to} \quad & -2y_2 + 2y_3 - y_4 \le 0 \\
& 5y_1 + 4y_2 - 3y_3 - y_4 \le 0 \\
& y_1 + y_2 + y_3 = 1 \\
\text{and} \quad & y_1 \ge 0, \quad y_2 \ge 0, \quad y_3 \ge 0, \quad y_4 \ge 0.
\end{aligned}
$$


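Applying the simplex method directly to this model yields the optimal solution: 
$y_1^* = 0$, $y_2^* = \frac{5}{11}$, $y_3^* = \frac{6}{11}$, $y_4^* = \frac{2}{11}$ (as well as the optimal dual solution, $x_1^* = 
\frac{7}{11}$, $x_2^* = \frac{4}{11}$, $x_3^* = \frac{2}{11}$). Thus the optimal mixed strategy for player II is $(y_1, y_2, y_3) = 
(0, \frac{5}{11}, \frac{6}{11})$, and the value of the game is again seen to be $v = y_4^* = \frac{2}{11}$. 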

Because we already had found the optimal mixed strategy for player II while 
dealing with the first model, we did not have to solve the second one. In general, you 
always can find the optimal mixed strategies for both players by choosing just one of 
the models (either one) and then using the simplex method to solve for both the 
optimal solution and the optimal dual solution. 
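
The reported solutions can be checked without running the simplex method. The Python sketch below, which uses illustrative variable names, confirms that each solution satisfies the constraints of its own model for Table 12.6 and that the two objective values coincide. For a pair of dual problems, feasible solutions with equal objective values are both optimal, so this check serves as a certificate of optimality.

```python
from fractions import Fraction as F

# Table 12.6 (payoffs to player I): rows for player I, columns for player II.
P = [[0, -2, 2],
     [5, 4, -3]]

x, x3 = [F(7, 11), F(4, 11)], F(2, 11)          # player I's reported solution
y, y4 = [F(0), F(5, 11), F(6, 11)], F(2, 11)    # player II's reported solution

# Player I's constraints: each column's expected payoff is at least x3.
col_payoffs = [sum(P[i][j] * x[i] for i in range(2)) for j in range(3)]
primal_ok = sum(x) == 1 and min(x) >= 0 and all(c >= x3 for c in col_payoffs)

# Player II's constraints: each row's expected loss is at most y4.
row_losses = [sum(P[i][j] * y[j] for j in range(3)) for i in range(2)]
dual_ok = sum(y) == 1 and min(y) >= 0 and all(r <= y4 for r in row_losses)

print(primal_ok, dual_ok, x3 == y4)  # True True True
```

The column payoffs come out as 20/11, 2/11, and 2/11, which also shows why player II's first pure strategy receives zero weight.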

Both of these linear programming models assumed that $v \ge 0$. If this assumption 
were violated, what would happen is that both models would have no feasible solutions, 
so the simplex method would stop quickly with this message. To avoid this 
risk, we could have added a positive constant, say 3 (the absolute value of the largest 
negative entry), to all of the entries in Table 12.6. This then would increase by 3 all 
of the coefficients of $x_1$, $x_2$, $y_1$, $y_2$, and $y_3$ in the inequality constraints of the two 
models. (See Prob. 13.) 





12.6 Extensions 


Although this chapter has considered only two-person, zero-sum games with a finite 
number of pure strategies, game theory extends far beyond this kind of game. In fact, 
extensive research has been done on a number of more complicated types of games, 
including the ones summarized in this section. 

One such type is the n-person game, where more than two players may par- 
ticipate in the game. This generalization is particularly important because, in many 
kinds of competitive situations, there frequently are more than two competitors in- 
volved. This may occur, for example, in competition among business firms, in inter- 
national diplomacy, and so forth. Unfortunately, the existing theory for such games 
is less satisfactory than it is for two-person games. 

Another generalization is the nonzero-sum game, where the sum of the payoffs 
to the players need not be zero (or any other fixed constant). This case reflects the 
fact that many competitive situations include noncompetitive aspects that contribute 
to the mutual advantage or mutual disadvantage of the players. For example, the 
advertising strategies of competing companies can affect not only how they will split 
the market but also the total size of the market for their competing products. 

Because mutual gain is possible, nonzero-sum games are further classified in 
terms of the degree to which the players are permitted to cooperate. At one extreme 
is the noncooperative game, where there is no preplay communication between the 
players. At the other extreme is the cooperative game, where preplay discussions and 
binding agreements are permitted. For example, competitive situations involving trade 
regulations between countries, or collective bargaining between labor and manage- 
ment, might be formulated as cooperative games. When there are more than two 
players, cooperative games also allow some or all of the players to form coalitions. 

Still another extension is to the class of infinite games, where the players have 
an infinite number of pure strategies available to them. These games are designed for 
the kind of situation where the strategy to be selected can be represented by a continuous 
decision variable. For example, this decision variable might be the time at which 
to take a certain action, or the proportion of one’s resources to allocate to a certain 
activity, in a competitive situation. Much research has been concentrated on such 
games in recent years. 

However, the analysis required in these extensions beyond the two-person, zero- 
sum, finite game is relatively complex and will not be pursued further here. 


12.7 Conclusions 


The general problem of how to make decisions in a competitive environment is a very 
common and important one. The fundamental contribution of game theory is that it 
provides a basic conceptual framework for formulating and analyzing such problems 
in simple situations. However, there is a considerable gap between what the theory 
can handle and the complexity of most competitive situations arising in practice. 
Therefore, the conceptual tools of game theory usually play just a supplementary role 
in dealing with these situations. 

Because of the importance of the general problem, research is continuing with 
some success to extend the theory to more complex situations. 


PROBLEMS 


1.* For each of the following payoff tables, determine the optimal strategy for each 
player by successively eliminating dominated strategies. (Indicate the order in which you elimi- 
nated strategies.) 


                    II 
               1    2    3 

          1    -    1    2 
(a)   I   2    1    2    1 
          3    1    0   -2 





2. Consider the game having the following payoff table. 








Determine the optimal strategy for each player by successively eliminating dominated strategies. 
Give a list of the dominated strategies (and the corresponding dominating strategies) in the 
order in which you were able to eliminate them. 


3. Find the saddle point for the game having the following payoff table. 











5. Two companies share the bulk of the market for a particular kind of product. Each 
is now planning its new marketing plans for the next year in an attempt to wrest some sales 
away from the other company. (The total sales for the product are relatively fixed, so one 
company can only increase its sales by winning them away from the other.) Each company is 
considering three possibilities: (1) better packaging of the product, (2) increased advertising, 
and (3) a slight reduction in price. The costs of the three alternatives are quite comparable and 
sufficiently large that each company will select just one. The estimated effect of each combi- 
nation of alternatives on the increased percentage of the sales for company I is 








Each company must make its selection before learning the decision of the other company. 
(a) Without eliminating dominated strategies, use the minimax (or maximin) criterion 
to determine the best strategy for each side. 
(b) Now identify and eliminate dominated strategies as far as possible. Make a list of 
the dominated strategies showing the order in which you were able to eliminate 
them. Then show the resulting reduced payoff table with no remaining dominated 
strategies. 


6. The labor union and management of a particular company have been negotiating a 
new labor contract. However, negotiations have now come to an impasse, with management 
making a ‘‘final’’ offer of a wage increase of $1.10/hour and the union making a ‘‘final’’ 
demand of a $1.60/hour increase. Therefore, both sides have agreed to have an impartial 
arbitrator set the wage increase somewhere between $1.10/hour and $1.60/hour (inclusively). 

The arbitrator has asked each side to submit to her a confidential proposal for a fair and 
economically reasonable wage increase (rounded to the nearest dime). From past experience, 
both sides know that this arbitrator normally accepts the proposal of the side that gives the most 
from its ‘‘final’’ figure. If neither side changes its final figure, or if they both give in the same 
amount, then the arbitrator normally compromises halfway between ($1.35 in this case). Each 
side now needs to determine what wage increase to propose for its own maximum advantage. 

(a) Formulate this problem as a two-person, zero-sum game. 

(b) Use the concept of dominated strategies to determine the best strategy for each side. 

(c) Without eliminating dominated strategies, use the minimax criterion to determine 
the best strategy for each side. 


7.* Two politicians soon will be starting their campaigns against each other for a certain 
political office. Each must now select the main issue he will emphasize as the theme of his 
campaign. Each has three advantageous issues from which to choose, but the relative effec- 
tiveness of each one would depend upon the issue chosen by his opponent. In particular, the 
estimated increase in the vote for politician I (expressed as a percentage of the total vote) 
resulting from each combination of issues is 











                             Issue for 
                           Politician II 
                          1      2      3 

                   1      7     -1      3 
Issue for          2      1      0      2 
Politician I       3      5     -3      1 





However, because considerable staff work is required to research and formulate the issue 
chosen, each politician must make his own choice before learning his opponent’s choice. Which 
issue should he choose? 

For each of the situations described here, formulate this problem as a two-person, zero-sum 
game, and then determine which issue should be chosen by each politician according to 
the specified criterion. 

(a) The current preferences of the voters are very uncertain, so each additional percent 
of votes won by one of the politicians has the same value to him. Use the minimax 
criterion. 

(b) A reliable poll has found that the percentage of the voters currently preferring 
politician I (before the issues have been raised) lies between 45 and 50 percent. 
(Assume a uniform distribution over this range.) Use the concept of dominated 
strategies, beginning with the strategies for politician I. 

(c) Suppose that the percentage described in part (b) actually were 45 percent. Should 
politician I use the minimax criterion? Explain. Which issue would you recommend? 
Why? 


8. Two manufacturers currently are competing for sales in two different but equally 
profitable product lines. In both cases the sales volume for manufacturer II is three times as 
large as that for manufacturer I. Because of a recent technological breakthrough, both manufacturers 
will be making a major improvement in both products. However, they are uncertain 
as to what development and marketing strategy they should follow. 

If both product improvements are developed simultaneously, either manufacturer can 
have them ready for sale in 12 months. Another alternative is to have a ‘‘crash program’’ to 
develop only one product first to try to get it marketed ahead of the competition. By doing this, 
manufacturer II could have one product ready for sale in 9 months, whereas manufacturer I 
would require 10 months (because of previous commitments for its production facilities). For 
either manufacturer, the second product could then be ready for sale in an additional 9 months. 

For either product line, if both manufacturers market their improved models simultane- 
ously, it is estimated that manufacturer I would increase its share of the total future sales of 
this product by 8 percent of the total (from 25 to 33 percent). Similarly, manufacturer I would 
increase its share by 20, 30, and 40 percent of the total if it markets the product sooner than 
manufacturer II by 2, 6, and 8 months, respectively. On the other hand, manufacturer I would 
lose 4, 10, 12, and 14 percent of the total if manufacturer II markets it sooner by 1, 3, 7, and 
10 months, respectively. 

Formulate this problem as a two-person, zero-sum game, and then determine which 
strategy the respective manufacturers should use according to the minimax criterion. 


9. Consider the following parlor game to be played between two players. Each player 
begins with three chips: one red, one white, and one blue. Each chip can be used only once. 

To begin, each player selects one of his chips and places it on the table, concealed. Both 
players then uncover the chips and determine the payoff to the winning player. In particular, 
if both players play the same kind of chip, it is a draw; otherwise, the following table indicates 
the winner and how much he receives from the other player. Next, each player selects one of 
his two remaining chips and repeats the procedure, resulting in another payoff according to the 
following table. Finally, each player plays his one remaining chip, resulting in the third and 
final payoff. 


Winning Chip Payoff 





Red beats White $50 
White beats Blue $40 
Blue beats Red $30 
Matching colors 0 





Formulate this problem as a two-person, zero-sum game by identifying the form of the strategies 
and payoffs. 


10. Consider the game having the following payoff table. 





Use the graphical procedure described in Sec. 12.4 to determine the value of the game and the 
optimal mixed strategy for each player according to the minimax criterion. Check your answer 
for player II by constructing his payoff table and applying the graphical procedure directly to 
this table. 


11.* For each of the following payoff tables, use the graphical procedure described in 
Sec. 12.4 to determine the value of the game and the optimal mixed strategy for each player 
according to the minimax criterion: 





12. Consider the following parlor game between two players. It begins when a referee 
flips a coin, notes whether it comes up heads or tails, and then shows this result to player I 
only. Player I may then either (1) pass and thereby pay $5 to player II, or (2) he may bet. If 
player I passes, the game is terminated. However, if he bets, the game continues, in which 
case player II may then either (1) pass and thereby pay $5 to player I, or (2) he may call. If 
player II calls, the referee then shows him the coin; if it came up heads, player II pays $10 to 
player I; if it came up tails, player II receives $10 from player I. 


(a) Give the pure strategies for each player. (Hint: Player I will have four pure strategies, 
each one specifying how he would respond to each of the two results the referee 
can show him; player II will have two pure strategies, each one specifying how he 
will respond if player I bets.) 

(b) Develop the payoff table for this game, using expected values for the entries when 
necessary. Determine whether it has a saddle point or not. 

(c) Use the graphical procedure described in Sec. 12.4 to determine the optimal mixed 
strategy for each player according to the minimax criterion. Also give the corre- 
sponding value of the game. 


13. Referring to the last paragraph of Sec. 12.5, suppose that 3 were added to all of 
the entries of Table 12.6 in order to ensure that the corresponding linear programming models 
for both players have feasible solutions with $x_3 \ge 0$ and $y_4 \ge 0$. Write out these two models. 
Based on the information given in Sec. 12.5, what are the optimal solutions for these two 
models? What is the relationship between $x_3^*$ and $y_4^*$? What is the relationship between the value 
of the original game $v$ and the values of $x_3^*$ and $y_4^*$? 


14.* Consider the game having the following payoff table. 





Use the approach described in Sec. 12.5 to formulate the problem of finding the optimal mixed 
strategies according to the minimax criterion as a linear programming problem. 


15. For each of the following payoff tables, transform the problem of finding the mini- 
max mixed strategies into an equivalent linear programming problem. 





it 
(a) 2 3 & 
I 2 -3 
I 2 0 3 
3 





16. Consider variation 3 of the political campaign problem (see Table 12.6). Refer to 
the resulting linear programming model for player I given near the end of Sec. 12.5. Ignoring 
the objective function variable ($x_3$), plot the feasible region for $x_1$ and $x_2$ graphically (as described 
in Sec. 3.1). (Hint: This feasible region consists of a single line segment.) Next, write 
an algebraic expression for the maximizing value of $x_3$ for any point in this feasible region. 
Finally, use this expression to demonstrate that the optimal solution must, in fact, be the one 
given in Sec. 12.5. 


17. Consider the linear programming model for player I given near the end of Sec. 
12.5 for variation 3 of the political campaign problem (see Table 12.6). Verify the optimal 
mixed strategies for both players given in Sec. 12.5 by applying the automatic routine for the 
simplex method in your OR COURSEWARE to this model to find both its optimal solution 
and its optimal dual solution. 


18. The A. J. Swim Team soon will have an important swim meet with the G. N. Swim 
Team. Each team has a star swimmer (John and Mark, respectively) who can swim very well 
in the 100-yard butterfly, backstroke, and breaststroke events. However, the rules prevent them 
from being used in more than two of these events. Therefore, their coaches now need to decide 
how to use them to maximum advantage. 

Each team will enter three swimmers per event (the maximum allowed). For each event, 
the following table gives the best time previously achieved by John and Mark as well as the 
best time for each of the other swimmers who will definitely enter that event. (Whichever event 
John or Mark does not swim, his team’s third entry for that event will be slower than the two 
shown in the table.) 


                  A. J. Swim Team                G. N. Swim Team
                       Entry                          Entry
                 1        2       John        Mark       1        2

Fly            1:01.6    59.1     57.5        58.4     1:03.2    59.8
Back           1:06.8   1:05.6   1:03.3      1:02.6    1:04.9   1:04.1
Breast         1:13.9   1:12.5   1:04.7                1:15.3   1:11.8

The points awarded are 5 points for first place, 3 for second place, 1 for third place, and 
none for lower places. Both coaches believe that all swimmers will essentially equal their best 
times in this meet. Thus John and Mark each will definitely be entered in two of these three 
events. 

(a) The coaches must submit all their entries before the meet without knowing the entries 
for the other team, and no changes are permitted later. The outcome of the meet is 
very uncertain, so each additional point has equal value for the coaches. Formulate 
this problem as a two-person, zero-sum game. Eliminate dominated strategies, and 
then use the graphical procedure described in Sec. 12.4 to find the optimal mixed 
strategy for each team according to the minimax criterion. 

(b) The situation and assignment are the same as in part (a), except that both coaches 
now believe that the A. J. Swim Team will win the swim meet if it can win 13 or 
more points in these three events, but will lose with less than 13 points. [Compare 
the resulting optimal mixed strategies with those obtained in part (a).] 

(c) Now suppose that the coaches submit their entries during the meet one event at a 
time. When submitting his entries for an event, the coach does not know who will 
be swimming that event for the other team, but he does know who has swum 
preceding events. The three key events just discussed are swum in the order listed 
in the table. Once again, the A. J. Swim Team needs 13 points in these events to 
win the swim meet. Formulate this problem as a two-person, zero-sum game. Then 
use the concept of dominated strategies to determine the best strategy for the G. N. 
team that actually “guarantees” it will win under the assumptions being made. 

(d) The situation is the same as in part (c). However, assume now that the coach for 
the G. N. team does not know about game theory and so may, in fact, choose any 
of his available strategies that have Mark swimming two events. Use the concept 
of dominated strategies to determine the best strategies from which the coach for 
the A. J. team should choose. If this coach knows that the other coach has a tendency 
to enter Mark in the butterfly and the backstroke more often than in the breaststroke, 
which strategy should he choose? 


19. Consider the general $m \times n$, two-person, zero-sum game. Let $p_{ij}$ denote the payoff 
to player I if he plays his strategy $i$ ($i = 1, \ldots, m$) and player II plays his strategy $j$ 
($j = 1, \ldots, n$). Strategy 1 (say) for player I is said to be weakly dominated by strategy 2 (say) if 
$p_{1j} \le p_{2j}$ for $j = 1, \ldots, n$ and $p_{1j} = p_{2j}$ for one or more values of $j$. 

(a) Assume that the payoff table possesses one or more saddle points, so that the players 
have corresponding optimal pure strategies under the minimax criterion. Prove that 
eliminating weakly dominated strategies from the payoff table cannot eliminate all 
these saddle points and cannot produce any new ones. 

(b) Assume that the payoff table does not possess any saddle points, so that the optimal 
strategies under the minimax criterion are mixed strategies. Prove that eliminating 
weakly dominated pure strategies from the payoff table cannot eliminate all optimal 
mixed strategies and cannot produce any new ones. 


20. Briefly describe what you feel are the advantages and disadvantages of the minimax 
criterion. 


Integer Programming 


In Part 2 you saw several examples of the numerous diverse applications of linear 
programming. However, one key limitation that prevents many more applications is 
the assumption of divisibility (see Sec. 3.3), which requires that noninteger values be 
permissible for decision variables. In many practical problems, the decision variables 
actually make sense only if they have integer values. For example, it is often necessary 
to assign people, machines, and vehicles to activities in integer quantities. If requiring 
integer values is the only way in which a problem deviates from a linear programming 
formulation, then it is an integer programming (IP) problem. (The more complete 
name is integer linear programming, but the adjective linear normally is dropped 
except when contrasting this problem to the more esoteric integer nonlinear program- 
ming problem, which is beyond the scope of this book.) 

The mathematical model for integer programming is simply the linear program- 
ming model (see Sec. 3.2) with the one additional restriction that the variables must 
have integer values. If only some of the variables are required to have integer values 
(so the divisibility assumption holds for the rest), this model is referred to as mixed 
integer programming (MIP). When distinguishing the all-integer problem from this 
mixed case, we call the former pure integer programming. 

For example, the Wyndor Glass Co. problem presented in Sec. 3.1 actually 
would have been an IP problem if the two decision variables, $x_1$ and $x_2$, had represented 
the total number of units to be produced of products 1 and 2, respectively, 
instead of the production rates. Because both products (glass doors and wood-framed 
windows) necessarily come in whole units, $x_1$ and $x_2$ would have to be restricted to 
integer values. 

There have been numerous such applications of integer programming that in- 
volve a direct extension of linear programming where the divisibility assumption must 
be dropped. However, another area of application may be of even greater importance, 
namely, problems involving a number of interrelated “yes-or-no decisions.” In such 
decisions, the only two possible choices are yes or no. For example, should we 
undertake a particular fixed project? Should we make a particular fixed investment? 
Should we locate a facility in a particular site? 

With just two choices, we can represent such decisions by decision variables 
that are restricted to just two values, say zero and one. Thus the jth yes-or-no decision 
would be represented by, say, $x_j$ such that 


$$x_j = \begin{cases} 1, & \text{if decision } j \text{ is yes} \\ 0, & \text{if decision } j \text{ is no.} \end{cases}$$


Such variables are called binary variables (or 0-1 variables). Consequently, IP 
problems that contain only binary variables sometimes are called binary integer 
programming (BIP) problems (or 0-1 integer programming problems). 

Section 13.1 presents a miniature version of a typical BIP problem. Additional 
formulation possibilities with binary variables are discussed in Sec. 13.2. The remaining 
sections then deal with ways to solve IP problems, including both BIP and 
MIP problems. 


13.1 Prototype Example 


The CALIFORNIA MANUFACTURING COMPANY is considering expansion by 
building a new factory in either Los Angeles or San Francisco, or perhaps even in 
both cities. It also is considering building at most one new warehouse, but the choice 
of the location is restricted to a city where a new factory is being built. The net present 
value (total profitability considering the time value of money) of each of these alter- 
natives is shown in the fourth column of Table 13.1. The last column gives the capital 
required for the respective investments, where the total capital available is 
$10,000,000. The objective is to find the feasible combination of alternatives that 
maximizes the total net present value. 

Although this problem is small enough that it can be solved very quickly by 
inspection (build factories in both cities but no warehouse), let us formulate the IP 
model for illustrative purposes. All the decision variables have the binary form, 


$$x_j = \begin{cases} 1, & \text{if decision } j \text{ is yes} \\ 0, & \text{if decision } j \text{ is no} \end{cases} \qquad (j = 1, 2, 3, 4).$$


Table 13.1 Data for California Manufacturing Co. Example 

Decision   Yes-or-No                    Decision   Net Present   Capital
Number     Question                     Variable   Value         Required

1          Build factory in L.A.?       x_1        $9 million    $6 million
2          Build factory in S.F.?       x_2        $5 million    $3 million
3          Build warehouse in L.A.?     x_3        $6 million    $5 million
4          Build warehouse in S.F.?     x_4        $4 million    $2 million

Capital available: $10 million 


Because the last two decisions represent mutually exclusive alternatives (the company 
wants at most one new warehouse), we need the constraint 


$$x_3 + x_4 \le 1.$$


Furthermore, decisions 3 and 4 are contingent decisions, because they are contingent 
on decisions 1 and 2, respectively (the company would consider building a warehouse 
in a city only if a new factory also is going there). This contingency is taken into 
account by the constraints 


$$\begin{aligned} x_3 - x_1 &\le 0 \\ x_4 - x_2 &\le 0, \end{aligned}$$


which force $x_3 = 0$ if $x_1 = 0$ and $x_4 = 0$ if $x_2 = 0$. Therefore, the complete BIP 
model is 


Maximize $Z = 9x_1 + 5x_2 + 6x_3 + 4x_4$, 

subject to 
$$\begin{aligned} 6x_1 + 3x_2 + 5x_3 + 2x_4 &\le 10 \\ x_3 + x_4 &\le 1 \\ -x_1 + x_3 &\le 0 \\ -x_2 + x_4 &\le 0 \\ x_j &\le 1 \\ x_j &\ge 0 \end{aligned}$$
and $x_j$ is an integer, for $j = 1, 2, 3, 4$. 


Equivalently, the last three lines of this model can be replaced by the single restriction 
$x_j$ is binary, for $j = 1, 2, 3, 4$. 
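
Because this model has only four binary variables, it can be checked by listing all 16 combinations. The following Python sketch does exactly that. The function names are illustrative rather than taken from the text, and exhaustive enumeration is used only because the example is tiny; it is not a general solution method.

```python
from itertools import product

NPV = (9, 5, 6, 4)       # objective coefficients ($ millions), from Table 13.1
CAPITAL = (6, 3, 5, 2)   # capital required ($ millions), from Table 13.1
BUDGET = 10              # capital available ($ millions)


def is_feasible(x):
    """Check the functional constraints of the BIP model for a 0-1 vector x."""
    x1, x2, x3, x4 = x
    return (
        sum(c * v for c, v in zip(CAPITAL, x)) <= BUDGET  # capital constraint
        and x3 + x4 <= 1      # at most one warehouse
        and x3 - x1 <= 0      # L.A. warehouse only with L.A. factory
        and x4 - x2 <= 0      # S.F. warehouse only with S.F. factory
    )


def total_npv(x):
    """Objective function Z for a 0-1 vector x."""
    return sum(c * v for c, v in zip(NPV, x))


def solve_by_enumeration():
    """Return the best feasible 0-1 vector and its objective value."""
    feasible = [x for x in product((0, 1), repeat=4) if is_feasible(x)]
    best = max(feasible, key=total_npv)
    return best, total_npv(best)


best_x, best_z = solve_by_enumeration()
print(best_x, best_z)
```

The result agrees with the solution found by inspection: build both factories and no warehouse, for a total net present value of $14 million.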


Except for its small size, this example is typical of many real applications of 
integer programming where the basic decisions to be made are of the yes-or-no type. 
Like the second pair of decisions for this example, groups of yes-or-no decisions often 
constitute groups of mutually exclusive alternatives such that only one decision in 
the group can be yes. Each group requires a constraint that the sum of the corresponding 
binary variables must be $= 1$ (if exactly one decision in the group must be yes) 
or $\le 1$ (if at most one decision in the group can be yes). Occasionally, decisions of 
the yes-or-no type are contingent decisions; i.e., decisions that depend upon previous 
decisions. In particular, one decision is said to be contingent on another decision if 
it is allowed to be yes only if the other is yes. This situation occurs when the contingent 
decision involves a follow-up action that would become irrelevant, or even impossible, 
if the other decision is no. The form that the resulting constraint takes always is that 
illustrated by the third and fourth constraints in the example. 


13.2 Some Other Formulation Possibilities 
with Binary Variables 


You have just seen a prototype example where the basic decisions of the problem are 
of the yes-or-no type, so that binary variables are introduced to represent these de- 
cisions. In addition, binary variables also can be very useful in other ways for for- 
mulating difficult problems in a tractable manner. In particular, these variables some- 
times enable us to take a problem whose natural formulation is intractable and 
reformulate it as a pure or mixed IP problem. 

This kind of situation arises when the original formulation of the problem fits 
either an IP or a linear programming format except for certain minor disparities in- 
volving combinatorial relationships in the model. By expressing these combinatorial 
relationships in terms of questions that must be answered yes or no, auxiliary binary 
variables can be introduced into the model to represent these yes-or-no decisions. 
Introducing these variables reduces the problem to an MIP problem (or a pure IP 
problem if all of the original variables also are required to have integer values). 

Some cases that can be handled by this approach are discussed next, where the 
$x_j$ denote the original variables of the problem (they may be either continuous or 
integer variables), and the $y_i$ denote the auxiliary binary variables that are introduced 
for the reformulation. 


Either-Or Constraints 


Consider the important case where a choice can be made between two constraints, so 
that only one must hold. For example, there may be a choice as to which of two 
resources to use for a certain purpose, so that it is necessary for only one of the two 
resource availability constraints to hold mathematically. To illustrate the approach to 
such situations, suppose that one of the requirements in the overall problem is that 


Either $3x_1 + 2x_2 \le 18$ 
or $x_1 + 4x_2 \le 16$. 


This requirement must be reformulated to fit it into the linear programming format 
where all specified constraints must hold. Let $M$ be an extremely large positive number. 
Then this requirement can be rewritten as 


either 
$$\begin{aligned} 3x_1 + 2x_2 &\le 18 \\ x_1 + 4x_2 &\le 16 + M \end{aligned}$$
or 
$$\begin{aligned} 3x_1 + 2x_2 &\le 18 + M \\ x_1 + 4x_2 &\le 16, \end{aligned}$$


because adding $M$ to the right-hand side of such constraints has the effect of eliminating 
them: they would be satisfied automatically by any solutions that satisfy 
the other constraints of the problem. (This formulation assumes that the set of feasible 
solutions for the overall problem is a bounded set and that $M$ is large enough so that 
it will not eliminate any feasible solutions.) This formulation is equivalent to the set 
of constraints 


$$\begin{aligned} 3x_1 + 2x_2 &\le 18 + My \\ x_1 + 4x_2 &\le 16 + M(1 - y). \end{aligned}$$


Because the auxiliary variable y must be either zero or 1, this formulation guarantees 
that one of the original constraints must hold while the other is, in effect, eliminated. 
This new set of constraints would then be appended to the other constraints in the 
overall model to give a pure or mixed IP problem (depending upon whether the $x_j$ are 
integer or continuous variables). 
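
The equivalence between the either-or requirement and the pair of constraints containing $y$ can be spot-checked numerically. The sketch below is only an illustration: the value chosen for M and the grid of test points are assumptions, not part of the text. It confirms that a point satisfies at least one original constraint exactly when some $y \in \{0, 1\}$ satisfies both reformulated constraints.

```python
M = 10**6  # assumed "extremely large" constant, far above any value on the grid


def either_or(x1, x2):
    """Original requirement: at least one of the two constraints holds."""
    return 3 * x1 + 2 * x2 <= 18 or x1 + 4 * x2 <= 16


def big_m(x1, x2, y):
    """Reformulated pair of constraints with the auxiliary binary variable y."""
    return (3 * x1 + 2 * x2 <= 18 + M * y
            and x1 + 4 * x2 <= 16 + M * (1 - y))


def reformulation_matches(points):
    """True if, at every point, some binary y works exactly when either-or holds."""
    return all(
        either_or(x1, x2) == any(big_m(x1, x2, y) for y in (0, 1))
        for x1, x2 in points
    )


# Hypothetical bounded region used for the check.
grid = [(x1, x2) for x1 in range(0, 21) for x2 in range(0, 21)]
print(reformulation_matches(grid))
```

Setting y = 0 keeps the first constraint and relaxes the second; y = 1 does the reverse. The check relies on M exceeding every left-hand side on the grid, which mirrors the boundedness assumption stated above.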

This approach is related directly to our earlier discussion about expressing com- 
binatorial relationships in terms of questions that must be answered yes or no. The 
combinatorial relationship involved concerns the combination of the other constraints 
of the model with the first of the two alternative constraints and then with the second. 
Which of these two combinations of constraints is better (in terms of the value of the 
objective function that then can be achieved)? To rephrase this question in yes-or-no 
terms, we ask two complementary questions: 


1. Should $x_1 + 4x_2 \le 16$ be selected as the constraint that must hold? 
2. Should $3x_1 + 2x_2 \le 18$ be selected as the constraint that must hold? 


Because exactly one of these questions is to be answered affirmatively, we let the 
binary terms, $y$ and $(1 - y)$, respectively, represent these yes-or-no decisions, so that 
$y + (1 - y) = 1$ (one yes) automatically. If instead separate binary variables, $y_1$ 
and $y_2$, had been used to represent these yes-or-no decisions, then an additional constraint, 
$y_1 + y_2 = 1$, would have been needed to make them mutually exclusive. 

A formal presentation of this approach is given next for a more general case. 


K Out of N Constraints Must Hold 


Consider the case where the overall model includes a set of N possible constraints 
such that only some K of these constraints must hold. (Assume that K < N.) Part of 
the optimization process is to choose which combination of K constraints permits the 
objective function to reach its best possible value. The (N — K) constraints not chosen 
are, in effect, eliminated from the problem, although feasible solutions might coinci- 
dentally still satisfy some of them. 

This case is a direct generalization of the preceding case, which had K = 1 and 
N = 2. Denote the N possible constraints by 


$$\begin{aligned} f_1(x_1, x_2, \ldots, x_n) &\le d_1 \\ f_2(x_1, x_2, \ldots, x_n) &\le d_2 \\ &\;\;\vdots \\ f_N(x_1, x_2, \ldots, x_n) &\le d_N. \end{aligned}$$


Then, applying the same logic as for the preceding case, we find that an equivalent 
formulation of the requirement that some K of these constraints must hold is 


$$\begin{aligned} f_1(x_1, x_2, \ldots, x_n) &\le d_1 + My_1 \\ f_2(x_1, x_2, \ldots, x_n) &\le d_2 + My_2 \\ &\;\;\vdots \\ f_N(x_1, x_2, \ldots, x_n) &\le d_N + My_N \\ \sum_{i=1}^{N} y_i &= N - K, \end{aligned}$$
and $y_i$ is binary, for $i = 1, 2, \ldots, N$, 


where $M$ is an extremely large positive number. Because the constraints on the $y_i$ 
guarantee that K of these variables will equal zero and those remaining will equal 1, 
K of the original constraints will be unchanged and the rest will, in effect, be eliminated. 
The choice of which K of these constraints should be retained is made by 
applying the appropriate algorithm to the overall problem so it finds an optimal solution 
for all of the variables simultaneously. 
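
A small Python sketch can illustrate why the constraint on the sum of the $y_i$ enforces "K out of N." It takes the two constraints from the either-or example, adds a third constraint that is purely hypothetical, and sets K = 2. For a given point, it compares a direct count of satisfied constraints with a search over every binary vector $y$ whose entries sum to N - K. The value of M and the test points are assumptions made for this illustration.

```python
from itertools import combinations

M = 10**6  # assumed big-M value, far larger than any left-hand side used below


def at_least_k_hold(lhs, rhs, K):
    """Direct check: at least K of the constraints lhs_i <= rhs_i hold."""
    return sum(l <= d for l, d in zip(lhs, rhs)) >= K


def big_m_feasible(lhs, rhs, K):
    """Is there a binary y with sum(y) = N - K and lhs_i <= rhs_i + M*y_i for all i?"""
    N = len(rhs)
    for relaxed in combinations(range(N), N - K):  # indices where y_i = 1
        y = [1 if i in relaxed else 0 for i in range(N)]
        if all(l <= d + M * yi for l, d, yi in zip(lhs, rhs, y)):
            return True
    return False


def lhs_values(x1, x2):
    """Left-hand sides; the third constraint (x1 + x2 <= 5) is hypothetical."""
    return [3 * x1 + 2 * x2, x1 + 4 * x2, x1 + x2]


RHS = [18, 16, 5]
K = 2
points = [(2, 3), (6, 0), (0, 5), (0, 8), (4, 4)]
for p in points:
    lhs = lhs_values(*p)
    print(p, at_least_k_hold(lhs, RHS, K), big_m_feasible(lhs, RHS, K))
```

The two checks agree at every point. Note that a relaxed constraint may still happen to hold, which is why the direct check counts "at least K" rather than "exactly K."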


Functions with N Possible Values 


Consider the situation where a given function is required to take on any one of N 
given values. Denote this requirement by 


$$f(x_1, x_2, \ldots, x_n) = d_1, \text{ or } d_2, \ldots, \text{ or } d_N.$$

One special case is where this function is 

$$f(x_1, x_2, \ldots, x_n) = \sum_{j=1}^{n} a_j x_j,$$

as on the left-hand side of a linear programming constraint. Another special case is 
where $f(x_1, x_2, \ldots, x_n) = x_j$ for a given value of $j$, so the requirement becomes 
that $x_j$ must take on any one of N given values. 

The equivalent IP formulation of this requirement is the following: 


$$\begin{aligned} f(x_1, x_2, \ldots, x_n) &= \sum_{i=1}^{N} d_i y_i \\ \sum_{i=1}^{N} y_i &= 1 \end{aligned}$$
and $y_i$ is binary, for $i = 1, 2, \ldots, N$, 


so this new set of constraints would replace this requirement in the statement of the 
overall problem. This set of constraints provides an equivalent formulation because 
exactly one $y_i$ must equal 1 and the others must equal zero, so exactly one $d_i$ is being 
chosen as the value of the function. In this case, there are N yes-or-no questions being 
asked, namely, should $d_i$ be the value chosen ($i = 1, 2, \ldots, N$)? Because the $y_i$ 
respectively represent these yes-or-no decisions, the second constraint makes them 
mutually exclusive alternatives. 


To illustrate how this case can arise, reconsider the Wyndor Glass Co. problem 
presented in Sec. 3.1. Eighteen percent of the total production capacity of Plant 3 
currently is unused and available for the two new products or for certain future 
products that will be ready for production soon. In order to leave any remaining 
capacity in usable blocks for these future products, management now wants to impose 
the restriction that the amount of capacity used by the two current new products must 
be 6 percent or 12 percent or 18 percent. Thus the third constraint of the original 
model ($3x_1 + 2x_2 \le 18$) now becomes 


$$3x_1 + 2x_2 = 6 \text{ or } 12 \text{ or } 18.$$


In the preceding notation, $N = 3$ with $d_1 = 6$, $d_2 = 12$, and $d_3 = 18$. Consequently, 
management’s new requirement should be formulated as follows: 


$$\begin{aligned} 3x_1 + 2x_2 &= 6y_1 + 12y_2 + 18y_3 \\ y_1 + y_2 + y_3 &= 1 \end{aligned}$$
and $y_1, y_2, y_3$ are binary. 


The overall model for this new version of the problem then consists of the original 
model (see Sec. 3.1) plus this new set of constraints that replaces the original third 
constraint. This replacement yields a very tractable MIP formulation. 
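
To see what the three choices of $y$ imply, one can solve the problem separately for each permitted capacity level. The sketch below assumes that the rest of the Wyndor model from Sec. 3.1 is: Maximize $Z = 3x_1 + 5x_2$ subject to $x_1 \le 4$, $2x_2 \le 12$, and nonnegativity. That model is not restated in this section, so treat it as an assumption. For a fixed level $d$, the feasible points lie on the line segment $3x_1 + 2x_2 = d$ inside the box, and a linear objective attains its maximum at an endpoint of that segment. The function name is illustrative.

```python
from fractions import Fraction


def best_on_line(d):
    """Best (Z, x1, x2) on 3*x1 + 2*x2 = d with 0 <= x1 <= 4 and 0 <= x2 <= 6.

    Assumes the Wyndor objective Z = 3*x1 + 5*x2 (recalled from Sec. 3.1).
    Returns None if the segment is empty.
    """
    # x1 = (d - 2*x2) / 3 must stay in [0, 4], which bounds x2 on both sides.
    lo = max(Fraction(0), Fraction(d - 12, 2))
    hi = min(Fraction(6), Fraction(d, 2))
    if lo > hi:
        return None
    candidates = []
    for x2 in (lo, hi):  # endpoints of the feasible segment
        x1 = (d - 2 * x2) / 3
        candidates.append((3 * x1 + 5 * x2, x1, x2))
    return max(candidates)


results = {d: best_on_line(d) for d in (6, 12, 18)}
best_d = max(results, key=lambda d: results[d][0])
for d, (z, x1, x2) in results.items():
    print(d, z, x1, x2)
print("best capacity level:", best_d)
```

Under these assumptions the best level is 18, that is, $y_3 = 1$. An MIP algorithm would make this selection as part of a single optimization rather than by solving three separate problems.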


The Fixed-Charge Problem 


It is quite common to incur a fixed charge or setup cost when undertaking an activity. 
For example, such a charge occurs when a production run to produce a small batch 
of a particular product is undertaken and the required production facilities must be set 
up to initiate the run. In such cases the total cost of the activity is the sum of a variable 
cost related to the level of the activity and the setup cost required to initiate the 
activity. Frequently the variable cost will be at least roughly proportional to the level 
of the activity. If it is, the total cost of the activity (say, activity j) can be represented 
by a function of the form 


$$f_j(x_j) = \begin{cases} k_j + c_j x_j, & \text{if } x_j > 0 \\ 0, & \text{if } x_j = 0, \end{cases}$$


where $x_j$ denotes the level of activity $j$ ($x_j \ge 0$), $k_j$ denotes the setup cost, and $c_j$ 
denotes the cost for each incremental unit. Were it not for the setup cost $k_j$, this cost 
structure would suggest the possibility of a linear programming formulation to determine 
the optimal levels of the competing activities. Fortunately, even with the $k_j$, MIP 
can still be used. 

To formulate the overall model, suppose that there are $n$ activities, each with 
the preceding cost structure (with $k_j \ge 0$ in every case and $k_j > 0$ for some 
$j = 1, 2, \ldots, n$), and that the problem is to 


Minimize $Z = f_1(x_1) + f_2(x_2) + \cdots + f_n(x_n)$, 
subject to given linear programming constraints. 


To convert this problem into an MIP format, we begin by posing $n$ questions 
that must be answered yes or no; namely, for each $j = 1, 2, \ldots, n$, should activity 
$j$ be undertaken ($x_j > 0$)? Each of these yes-or-no decisions is then represented by an 
auxiliary binary variable $y_j$, so that 


$$Z = \sum_{j=1}^{n} (c_j x_j + k_j y_j),$$

where 

$$y_j = \begin{cases} 1, & \text{if } x_j > 0 \\ 0, & \text{if } x_j = 0. \end{cases}$$


Therefore, the $y_j$ can be viewed as contingent decisions similar to (but not identical 
to) the type considered in Sec. 13.1. Let $M$ be an extremely large positive number 
that exceeds the maximum feasible value of any $x_j$ ($j = 1, 2, \ldots, n$). Then the 
constraints, 


$$x_j \le My_j \quad \text{for } j = 1, 2, \ldots, n,$$


will ensure that $y_j = 1$ rather than zero whenever $x_j > 0$. The one difficulty remaining 
is that these constraints leave $y_j$ free to be either zero or 1 when $x_j = 0$. Fortunately, 
this difficulty is automatically resolved because of the nature of the objective function. 
The case where $k_j = 0$ can be ignored because $y_j$ can then be deleted from the 
formulation. So we consider the only other case, namely, where $k_j > 0$. When $x_j = 0$, 
so that the constraints permit a choice between $y_j = 0$ and $y_j = 1$, $y_j = 0$ must 
yield a smaller value of $Z$ than $y_j = 1$. Therefore, because the objective is to minimize 
$Z$, an algorithm yielding an optimal solution would always choose $y_j = 0$ when 
$x_j = 0$. 
To summarize, the MIP formulation of the fixed-charge problem is 


Minimize $Z = \sum_{j=1}^{n} (c_j x_j + k_j y_j)$, 
subject to the original constraints, plus 
$$x_j - My_j \le 0$$
and $y_j$ is binary, for $j = 1, 2, \ldots, n$. 


If the $x_j$ also had been restricted to be integer, then this would be a pure IP problem. 

To illustrate this approach, look again at Sec. 3.4 at the air pollution problem 
faced by the Nori & Leets Co. The first of the abatement methods considered, 
increasing the height of the smokestacks, actually would involve a substantial fixed 
charge to get ready for any increase in addition to a variable cost that would be 
roughly proportional to the amount of increase. After conversion to the equivalent 
annual costs used in the formulation, this fixed charge would be $2,000,000 each for 
the blast furnaces and the open-hearth furnaces, whereas the variable costs are those 
identified in Table 3.14. Thus, in the preceding notation, $k_1 = 2$, $k_2 = 2$, $c_1 = 8$, 
and $c_2 = 10$. Because the other abatement methods do not involve any fixed charges, 
$k_j = 0$ for $j = 3, 4, 5, 6$. Consequently, the new MIP formulation of this problem 
is 


Minimize $Z = 8x_1 + 10x_2 + 7x_3 + 6x_4 + 11x_5 + 9x_6 + 2y_1 + 2y_2$, 
subject to the constraints given in Sec. 3.4, plus 
$$\begin{aligned} x_1 - My_1 &\le 0 \\ x_2 - My_2 &\le 0, \end{aligned}$$
and $y_1, y_2$ are binary. 
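
The argument that minimization automatically picks the right value of each $y_j$ can be illustrated for a single activity. The sketch below uses $k_1 = 2$ and $c_1 = 8$ from the example above; the value of M and the activity levels tested are assumptions chosen only for illustration. For each level $x$, it takes the cheaper of the binary choices allowed by $x \le My$ and compares the result with the original fixed-charge cost function.

```python
def fixed_charge_cost(x, k, c):
    """Original cost f(x): k + c*x when x > 0, and 0 when x = 0."""
    return k + c * x if x > 0 else 0


def mip_cost(x, k, c, M):
    """Smallest c*x + k*y over binary y that satisfies x <= M*y."""
    return min(c * x + k * y for y in (0, 1) if x <= M * y)


M = 1000                     # assumed bound exceeding every level tested below
k, c = 2, 8                  # k_1 and c_1 from the Nori & Leets illustration
levels = [0, 0.25, 0.5, 1]   # hypothetical activity levels

for x in levels:
    print(x, fixed_charge_cost(x, k, c), mip_cost(x, k, c, M))
```

When x > 0, only y = 1 is allowed, so the setup cost is charged. When x = 0, both values of y are allowed and the minimization selects y = 0, so no setup cost is charged.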


Binary Representation of General Integer Variables 


Suppose that you have a pure IP problem where most of the variables are binary 
variables, but the presence of a few general integer variables prevents solving the 
problem by one of the very efficient BIP algorithms now available. A nice way to 
circumvent this difficulty is to use the binary representation for each of these general 
integer variables. Specifically, if the bounds on an integer variable $x$ are 


$$0 \le x \le u, \quad \text{where } 2^N \le u < 2^{N+1},$$


then its binary representation is 

$$x = \sum_{i=0}^{N} 2^i y_i,$$


where the $y_i$ variables are (auxiliary) binary variables. Substituting this binary representation 
for each of the general integer variables (with a different set of auxiliary 
binary variables for each) thereby reduces the entire problem to a BIP model. 

For example, suppose that an IP problem has just two general integer variables, 
$x_1$ and $x_2$, along with many binary variables, and that the functional constraints include 


$$\begin{aligned} x_1 &\le 5 \\ 2x_1 + 3x_2 &\le 30. \end{aligned}$$

Using $u = 5$ for $x_1$ and $u = 10$ for $x_2$, their binary representations become 


$$\begin{aligned} x_1 &= y_0 + 2y_1 + 4y_2 \\ x_2 &= y_3 + 2y_4 + 4y_5 + 8y_6. \end{aligned}$$


After substituting these expressions for the respective variables throughout all the 
functional constraints and the objective function, the two functional constraints noted 
above become 


$$\begin{aligned} y_0 + 2y_1 + 4y_2 &\le 5 \\ 2y_0 + 4y_1 + 8y_2 + 3y_3 + 6y_4 + 12y_5 + 24y_6 &\le 30. \end{aligned}$$


Note that each feasible value of $x_1$ corresponds to one of the feasible values of the 
vector $(y_0, y_1, y_2)$, and similarly for $x_2$ and $(y_3, y_4, y_5, y_6)$. 
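
The correspondence between integer values and binary vectors can be sketched in a few lines of Python. The function names below are illustrative. The first computes N from the upper bound u, and the second lists the values of x reachable by binary vectors that also respect the bound x ≤ u.

```python
from itertools import product


def highest_power(u):
    """Return N such that 2**N <= u < 2**(N + 1), for an integer bound u >= 1."""
    return u.bit_length() - 1


def representable_values(u):
    """Values x = sum(2**i * y_i) over binary (y_0, ..., y_N) that satisfy x <= u."""
    N = highest_power(u)
    values = set()
    for y in product((0, 1), repeat=N + 1):
        x = sum(2**i * yi for i, yi in enumerate(y))
        if x <= u:
            values.add(x)
    return sorted(values)


print(highest_power(5), representable_values(5))
print(highest_power(10), representable_values(10))
```

With u = 5 this gives N = 2 (three binary variables), and with u = 10 it gives N = 3 (four binary variables), matching the representations above. The bound constraint is still needed, because three binary variables could otherwise represent 6 and 7 as well.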

For an IP problem where all the variables are (bounded) general integer vari- 
ables, it would be possible to use this same technique to reduce the problem to a BIP 
model. However, this is not advisable for most cases because of the explosion in the 
number of variables involved. Applying a good IP algorithm to the original IP model 
generally should be more efficient than applying a good BIP algorithm to the much 
larger BIP model. 

In general terms, for all the formulation possibilities with auxiliary binary vari- 
ables discussed in this section, we need to strike the same note of caution. This 
approach sometimes requires adding a relatively large number of such variables, which 
can make the model computationally infeasible. In fact, as the next section explains, 
you may even be in trouble with fewer than a hundred binary variables. 


13.3 Some Perspectives on Solving Integer 
Programming Problems 


It may seem that IP problems should be relatively easy to solve. After all, linear 
programming problems can be solved extremely efficiently, and the only difference 
is that IP problems have far fewer solutions to be considered. In fact, pure IP problems 
with a bounded feasible region are guaranteed to have just a finite number of feasible 
solutions. 

Unfortunately, there are two fallacies in this line of reasoning. One is that having 
a finite number of feasible solutions ensures that the problem is readily solvable. Finite 
numbers can be astronomically large. For example, consider the simple case of BIP 
problems. With $n$ variables, there are $2^n$ solutions to be considered (where some of 
these solutions can subsequently be discarded because they violate the functional 
constraints). Thus, each time $n$ is increased by one, the number of solutions is doubled. 
This pattern is referred to as the exponential growth of the difficulty of the problem. 
With $n = 10$, there are more than a thousand solutions (1,024); with $n = 20$, there 
are more than a million; with $n = 30$, there are more than a billion; and so forth. 
Therefore, even the fastest computers are incapable of performing exhaustive enumeration 
(checking each solution for feasibility and, if it is feasible, calculating the 
value of the objective function) for BIP problems with more than a few dozen variables, 
let alone for general IP problems with the same number of integer variables. Sophisticated 
algorithms, such as those described in subsequent sections, can do somewhat 
better. In fact, Sec. 13.6 discusses how recently developed algorithms have successfully 
solved certain vastly larger BIP problems (up to 2,756 variables). Nevertheless, 
because of exponential growth, even the best algorithms cannot be guaranteed to solve 
every relatively small problem (fewer than a hundred binary or integer variables). 
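
The growth just described is easy to tabulate. The short sketch below prints the number of 0-1 solutions for several values of n and gives a rough enumeration time. The checking rate of one billion solutions per second is a hypothetical figure chosen for illustration, not a claim about any particular computer.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def num_binary_solutions(n):
    """Number of 0-1 vectors with n components."""
    return 2 ** n


def years_to_enumerate(n, checks_per_second):
    """Rough time to examine every 0-1 vector at an assumed checking rate."""
    return num_binary_solutions(n) / checks_per_second / SECONDS_PER_YEAR


RATE = 1e9  # hypothetical: one billion solutions checked per second
for n in (10, 20, 30, 60, 100):
    print(n, num_binary_solutions(n), years_to_enumerate(n, RATE))
```

Each additional variable doubles the count, so the time required moves from a fraction of a second at n = 30 to far longer than any practical horizon at n = 100, even at the assumed rate.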

The second fallacy is that removing some feasible solutions (the noninteger ones) 
from a linear programming problem will make it easier to solve. To the contrary, it 
is only because all of these feasible solutions are there that the guarantee can be given 
(see Sec. 5.1) that there will be a corner-point feasible solution (basic feasible solution) 
that is optimal for the overall problem. This guarantee is the key to the remarkable 
efficiency of the simplex method. As a result, linear programming problems generally 
are much easier to solve than IP problems. 

Consequently, most successful algorithms for integer programming incorporate 
the simplex method (or dual simplex method) as much as they can by relating portions 
of the IP problem under consideration to the corresponding linear programming prob- 
lem (i.e., the same problem except that the integer restriction is deleted). For any 
given IP problem, this corresponding linear programming problem commonly is re- 
ferred to as its LP-relaxation. The algorithms presented in the next two sections 
illustrate how a sequence of LP-relaxations for portions of an IP problem can be used 
to solve the overall IP problem effectively. 

There is one special situation where solving an IP problem is no more difficult 
than solving its LP-relaxation once by the simplex method, namely, when the optimal 
solution to the latter problem turns out to satisfy the integer restriction of the IP 
problem. When this situation occurs, this solution must be optimal for the IP problem 
as well, because it is the best solution among all the feasible solutions for the LP-relaxation, 
which includes all the feasible solutions for the IP problem. Therefore, it 
is common for an IP algorithm to begin by applying the simplex method to the LP-relaxation 
to check whether this fortuitous outcome has occurred. 

Although it generally is quite fortuitous indeed for the optimal solution to the 
LP-relaxation to be integer as well, there actually exist several special types of IP 
problems for which this outcome is guaranteed. You already have seen the most 
prominent of these special types in Chaps. 7 and 10, namely, the minimum cost flow 
problem (with integer parameters) and its special cases (the transportation problem, 
the transshipment problem, the assignment problem, the shortest path problem, and 
the maximum flow problem). The reason that this guarantee can be given for these 
types of problems is that they possess a certain special structure (e.g., see Table 7.6) 
that ensures that every basic feasible solution is integer, as stated in the integer 
solutions property given in Secs. 7.1 and 10.6. Consequently, these special types of 
IP problems can be treated like linear programming problems (which is why three of 
them appear in Chap. 7), because they can be solved completely by a streamlined 
version of the simplex method. 

Although this much simplification is somewhat unusual, in practice IP problems 
frequently have some special structure that can be exploited to simplify the problem. 
Sometimes, very large versions of these problems can be solved successfully. Special- 
purpose algorithms designed specifically to exploit certain kinds of special structures 
are becoming increasingly important in integer programming. 

Thus the two primary determinants of computational difficulty for an IP problem 
are (1) the number of integer variables and (2) the structure of the problem. This 
situation is in contrast to linear programming, where the number of (functional) 
constraints is much more important than the number of variables. In integer programming, 
the number of constraints is of some importance (especially if LP-relaxations 
are being solved), but it is strictly secondary to the other two factors. In fact, there 
occasionally are cases where increasing the number of constraints decreases the computation 
time because the number of feasible solutions has been reduced. For MIP 
problems, it is the number of integer variables rather than the total number of variables 
that is important, because the continuous variables have almost no effect on computational 
effort. 

Because IP problems are, in general, much more difficult to solve than linear 
programming problems, it sometimes is tempting to use the approximate procedure 
of simply applying the simplex method to the LP-relaxation and then rounding the 
noninteger values to integers in the resulting solution. This approach may be adequate 
for some applications, especially if the values of the variables are quite large so that 
rounding creates relatively little error. However, you should beware of two pitfalls 
involved in this approach. 

One pitfall is that the optimal linear programming solution is not necessarily 
feasible after it is rounded. Often it is difficult to see in which way the rounding 
should be done to retain feasibility. It may even be necessary to change the value of 
some variables by one or more units after rounding. To illustrate, suppose that some 
of the constraints are 

$$
\begin{aligned}
-x_1 + x_2 &\le 3\tfrac{1}{2} \\
x_1 + x_2 &\le 16\tfrac{1}{2}
\end{aligned}
$$

and that the simplex method has identified the optimal solution for the LP-relaxation
as $x_1 = 6\tfrac{1}{2}$, $x_2 = 10$. Notice that it is impossible to round $x_1$ to 6 or 7 (or any other
integer) and retain feasibility. Feasibility can be retained only by also changing the
integer value of $x_2$. It is easy to imagine how such difficulties can be compounded
when there are tens or hundreds of constraints and variables.
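
The first pitfall can be checked numerically. The following minimal Python sketch uses the two constraints above; the function name `is_feasible` and the search range for $x_1$ are illustrative choices, not part of the text.

```python
from fractions import Fraction

def is_feasible(x1, x2):
    """Check the two constraints  -x1 + x2 <= 3 1/2  and  x1 + x2 <= 16 1/2."""
    return (-x1 + x2 <= Fraction(7, 2)) and (x1 + x2 <= Fraction(33, 2))

# LP-relaxation optimum quoted in the text: x1 = 6 1/2, x2 = 10.
lp_is_feasible = is_feasible(Fraction(13, 2), 10)

# Integer values of x1 that remain feasible when x2 is held at 10 (none exist).
feasible_x1_at_10 = [x1 for x1 in range(0, 21) if is_feasible(x1, 10)]

# After also changing x2 by one unit (to 9), rounding x1 to 6 or 7 works.
feasible_x1_at_9 = [x1 for x1 in range(0, 21) if is_feasible(x1, 9)]

print(lp_is_feasible, feasible_x1_at_10, feasible_x1_at_9)
```

With $x_2 = 10$ the two constraints force $x_1 = 6\tfrac{1}{2}$ exactly, which is why no integer value of $x_1$ survives until $x_2$ is changed as well.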

Even if the optimal solution for the LP-relaxation is rounded successfully, there 
remains another pitfall. There is no guarantee that this rounded solution will be the 
optimal integer solution. In fact, it may even be far from optimal in terms of the value 
of the objective function. This fact is illustrated by the following problem: 


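$$
\begin{aligned}
\text{Maximize}\quad & Z = x_1 + 5x_2, \\
\text{subject to}\quad & x_1 + 10x_2 \le 20 \\
& x_1 \le 2 \\
\text{and}\quad & x_1 \ge 0,\ x_2 \ge 0, \\
& x_1, x_2 \text{ are integers.}
\end{aligned}
$$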


Because there are only two decision variables, this problem can be depicted graphically
as shown in Fig. 13.1. Either the graph or the simplex method may be used to find
that the optimal solution for the LP-relaxation is $x_1 = 2$, $x_2 = \tfrac{9}{5}$, with $Z = 11$. If a
graphical solution were not available (which it would not be with more decision
variables), then the variable with the noninteger value $x_2 = \tfrac{9}{5}$ would normally be
rounded in the feasible direction to $x_2 = 1$. The resulting integer solution is $x_1 = 2$,
$x_2 = 1$, which yields $Z = 7$. Notice that this solution is far from the optimal solution
$(x_1, x_2) = (0, 2)$, where $Z = 10$.
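
Because this problem is tiny, the second pitfall can be confirmed by enumerating every integer point. The sketch below does so in Python. The helper names are illustrative, and the enumeration ranges rely on the bounds implied by the constraints ($x_1 \le 2$, $x_2 \le 2$).

```python
from fractions import Fraction

def z(x1, x2):
    """Objective function Z = x1 + 5*x2."""
    return x1 + 5 * x2

def feasible(x1, x2):
    """Functional and nonnegativity constraints of the example."""
    return x1 + 10 * x2 <= 20 and x1 <= 2 and x1 >= 0 and x2 >= 0

lp_optimum = (2, Fraction(9, 5))   # optimal solution of the LP-relaxation
rounded = (2, 1)                   # x2 rounded in the feasible direction

# Enumerate all integer points in the box 0 <= x1 <= 2, 0 <= x2 <= 2.
integer_points = [(x1, x2) for x1 in range(3) for x2 in range(3) if feasible(x1, x2)]
best = max(integer_points, key=lambda p: z(*p))

print("LP-relaxation:", lp_optimum, "Z =", z(*lp_optimum))
print("Rounded:      ", rounded, "Z =", z(*rounded))
print("Best integer: ", best, "Z =", z(*best))
```

Enumeration is only practical here because there are nine candidate points; the rest of the chapter is concerned with avoiding exactly this kind of exhaustive search.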

Because of these two pitfalls, a better approach for dealing with IP problems 
that are too large to be solved exactly is to use one of the available heuristic algorithms.
These algorithms are extremely efficient for large problems, but they are not 
guaranteed to find an optimal solution. However, they do tend to be considerably 
more effective than the rounding approach just discussed in finding very good feasible 
solutions. 

For IP problems that are small enough to be solved to optimality, a considerable 
number of algorithms now are available. Until recently, no IP algorithms possessed 
computational efficiency that was even remotely comparable to the simplex method 
(except on special types of problems). Therefore, developing IP algorithms has continued
to be an active area of research. Fortunately, some exciting algorithmic advances
were made during the mid- and late 1980s, and additional progress can be
anticipated during the next decade. These recent advances are discussed further in
Sec. 13.6.

Figure 13.1 Illustrative integer programming problem.

The most popular mode for IP algorithms is to use the branch-and-bound tech- 
nique and related ideas to implicitly enumerate the feasible integer solutions, and we 
shall focus on this approach. The next section presents the branch-and-bound technique 
in a general context, and illustrates it with a basic branch-and-bound algorithm for 
BIP problems. Section 13.5 presents another algorithm of the same type for general 
MIP problems. 


13.4 The Branch-and-Bound Technique and Its Application 
to Binary Integer Programming 


Because any bounded pure IP problem has only a finite number of feasible solutions, 
it is natural to consider using some kind of enumeration procedure for finding an 
optimal solution. Unfortunately, as we discussed in the preceding section, this finite 
number can be, and usually is, very large. Therefore, it is imperative that any enu- 
meration procedure be cleverly structured so that only a tiny fraction of the feasible 
solutions actually need be examined. For example, dynamic programming (see Chap. 
11) provides one such kind of procedure for many problems having a finite number 
of feasible solutions (although it is not particularly efficient for most IP problems). 
Another such approach is provided by the branch-and-bound technique. This tech- 
nique and variations of it have been applied with some success to a variety of oper- 
ations research problems, but it is especially well known for its application to IP 
problems. 

The basic concept underlying the branch-and-bound technique is to divide and 
conquer. Since the original "large" problem is too difficult to be solved directly, it
is divided into smaller and smaller subproblems until these subproblems can be conquered.
The dividing (branching) is done by partitioning the entire set of feasible
solutions into smaller and smaller subsets. The conquering (fathoming) is done partially
by bounding how good the best solution in the subset can be, and then discarding
the subset if its bound indicates that it cannot possibly contain an optimal solution for 
the original problem. 

We shall now describe in turn these three basic steps—branching, bounding, 
and fathoming—and illustrate them by applying a branch-and-bound algorithm to the 
prototype example (the California Manufacturing Co. problem) presented in Sec. 13.1. 


Branching 


When dealing with binary variables, the most straightforward way to partition the set 
of feasible solutions into subsets is to fix the value of one of the variables (say, $x_1$)
at $x_1 = 0$ for one subset and at $x_1 = 1$ for the other subset. Doing this for the
prototype example divides the whole problem into the two smaller subproblems shown 
below. 


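Subproblem 1: ($x_1 = 0$)

$$
\begin{aligned}
\text{Maximize}\quad & Z = 5x_2 + 6x_3 + 4x_4, \\
\text{subject to}\quad & 3x_2 + 5x_3 + 2x_4 \le 10 \\
& x_3 + x_4 \le 1 \\
& x_3 \le 0 \\
& -x_2 + x_4 \le 0 \\
\text{and}\quad & x_j \text{ is binary, for } j = 2, 3, 4.
\end{aligned}
$$


Subproblem 2: ($x_1 = 1$)

$$
\begin{aligned}
\text{Maximize}\quad & Z = 9 + 5x_2 + 6x_3 + 4x_4, \\
\text{subject to}\quad & 3x_2 + 5x_3 + 2x_4 \le 4 \\
& x_3 + x_4 \le 1 \\
& x_3 \le 1 \\
& -x_2 + x_4 \le 0 \\
\text{and}\quad & x_j \text{ is binary, for } j = 2, 3, 4.
\end{aligned}
$$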


Figure 13.2 portrays this dividing (branching) into subproblems by a tree (defined in 
Sec. 10.2) with branches (arcs) from the All node (corresponding to the whole problem 
having All feasible solutions) to the two nodes corresponding to the two subproblems. 
This tree, which will continue "growing branches" iteration by iteration, is referred
to as the solution tree (or enumeration tree) for the algorithm. The variable used to
do this branching at any iteration by assigning values to the variable (as with $x_1$ above)
is called the branching variable.

Later in the section you will see that one of these subproblems can be conquered 
(fathomed) immediately, whereas the other subproblem will need to be divided further 
into smaller subproblems by setting $x_2 = 0$ or $x_2 = 1$, etc.

For other IP problems where the integer variables have more than two possible 
values, the branching can still be done by setting the branching variable at its respec- 
tive individual values, thereby creating more than two new subproblems. However, 
a good alternate approach is to specify a range of values (e.g., $x_j \le 2$ or $x_j \ge 3$) for
the branching variable for each new subproblem. This is the approach used for the 
algorithm presented in Sec. 13.5. 


Figure 13.2 The solution tree created by the branching for the first
iteration of the BIP branch-and-bound algorithm for the example.





Bounding 


For each of these subproblems, we now need to obtain a bound on how good its best
feasible solution can be. The standard way of doing this is to quickly solve a simpler
relaxation of the subproblem. In most cases, a relaxation of a problem is obtained
simply by deleting ("relaxing") one set of constraints that had made the problem
difficult to solve. For IP problems, the most troublesome constraints are those requiring
the respective variables to be integer. Therefore, the most widely used relaxation
is the LP-relaxation that deletes this set of constraints.

To illustrate for the example, consider first the whole problem given in Sec.
13.1. Its LP-relaxation is obtained by deleting the last line of the model ($x_j$ is an
integer, for $j = 1, 2, 3, 4$), but retaining the $x_j \le 1$ and $x_j \ge 0$ constraints. Using
the simplex method to quickly solve this LP-relaxation yields its optimal solution,


$(x_1, x_2, x_3, x_4) = (\tfrac{5}{6}, 1, 0, 1)$, with $Z = 16\tfrac{1}{2}$.


Therefore, $Z \le 16\tfrac{1}{2}$ for all feasible solutions for the original BIP problem (since these
solutions are a subset of the feasible solutions for the LP-relaxation). In fact, as
summarized below, this bound of $16\tfrac{1}{2}$ can be rounded down to 16, because all coefficients
in the objective function are integer, so all integer solutions must have an
integer value for $Z$.


Bound for whole problem: $Z \le 16$.
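
As a check on this bound, the short Python sketch below verifies that the quoted LP-relaxation solution satisfies the constraints and then rounds its objective value down. The coefficient data `c`, `A`, and `b` are an assumption: they are the Sec. 13.1 model as implied by the subproblems shown earlier in this section, so treat them as illustrative if your model differs.

```python
from fractions import Fraction
from math import floor

# Assumed data for the prototype example (inferred from Subproblems 1 and 2).
c = [9, 5, 6, 4]                 # objective coefficients
A = [[6, 3, 5, 2],               # 6x1 + 3x2 + 5x3 + 2x4 <= 10
     [0, 0, 1, 1],               #              x3 +  x4 <= 1
     [-1, 0, 1, 0],              # -x1        + x3       <= 0
     [0, -1, 0, 1]]              #       -x2       +  x4 <= 0
b = [10, 1, 0, 0]

x_lp = [Fraction(5, 6), 1, 0, 1]  # LP-relaxation optimum quoted in the text

# The solution must satisfy every functional constraint and 0 <= xj <= 1.
feasible = (all(sum(a * x for a, x in zip(row, x_lp)) <= rhs for row, rhs in zip(A, b))
            and all(0 <= x <= 1 for x in x_lp))

z_lp = sum(cj * xj for cj, xj in zip(c, x_lp))   # exact value 33/2
bound = floor(z_lp)                              # valid because all cj are integer

print(feasible, z_lp, bound)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the rounding step is not affected by floating-point error.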


Now let us obtain the bounds for the two subproblems in the same way. Their
LP-relaxations are obtained from the models in the preceding subsection by replacing
the constraints, $x_j$ is binary for $j = 2, 3, 4$, by $0 \le x_j \le 1$ for $j = 2, 3, 4$. Applying
the simplex method then yields their optimal solutions (plus the fixed value of $x_1$)
shown below.


LP-relaxation of Subproblem 1: $(x_1, x_2, x_3, x_4) = (0, 1, 0, 1)$, with $Z = 9$.
LP-relaxation of Subproblem 2: $(x_1, x_2, x_3, x_4) = (1, \tfrac{4}{5}, 0, \tfrac{4}{5})$, with $Z = 16\tfrac{1}{5}$.


The resulting bounds for the subproblems then are


Bound for Subproblem 1: $Z \le 9$,
Bound for Subproblem 2: $Z \le 16$.


Figure 13.3 summarizes these results, where the numbers given just below the 
nodes are the bounds, and below each bound is the optimal solution obtained for the 
LP-relaxation. 


Fathoming 


A subproblem can be conquered (fathomed), and thereby dismissed from further con- 
sideration, in the three ways described below. 

One way is illustrated by the results for Subproblem 1 given by the $x_1 = 0$ node
in Fig. 13.3. Note that the (unique) optimal solution for its LP-relaxation, $(x_1, x_2, x_3, x_4) = (0, 1, 0, 1)$,
is an integer solution. Therefore, this solution must also be the
optimal solution for Subproblem 1 itself. This solution should be stored as the first 
incumbent (the best feasible solution found so far) for the whole problem, along with 
its value of Z. This value is denoted by 


Z* = value of Z for the current incumbent, 

so $Z^* = 9$ at this point. Having stored this solution, there is no reason to consider
Subproblem 1 any further by branching from the $x_1 = 0$ node, etc. Doing so could
only lead to other feasible solutions that are inferior to the incumbent, and we have
no interest in such solutions. Because it has been solved, we fathom (dismiss) Subproblem 1 now.

Figure 13.3 The results of bounding for the first iteration of the BIP
branch-and-bound algorithm for the example.

The above results suggest a second key fathoming test. Since $Z^* = 9$, there is
no reason to consider further any subproblem whose bound $\le 9$, since such a subproblem
cannot have a feasible solution better than the incumbent. Stated more generally,
a subproblem is fathomed whenever its


Bound $\le Z^*$.


This outcome does not occur in the current iteration of the example because Subproblem 2
has a bound of 16 that is larger than 9. However, it might occur later for
descendants of this subproblem (new smaller subproblems created by branching on
this subproblem, and then perhaps branching further through subsequent "generations").
Furthermore, as new incumbents with larger values of $Z^*$ are found, it will
become easier to fathom in this way.

The third way of fathoming is quite straightforward. If the simplex method finds 
that a subproblem’s LP-relaxation has no feasible solutions, then the subproblem itself 
must have no feasible solutions, so it can be dismissed (fathomed). 

In all three cases, we are conducting our search for an optimal solution by 
retaining for further investigation only those subproblems that could possibly have a 
feasible solution better than the current incumbent. 


Summary of Fathoming Tests 


A subproblem is fathomed (dismissed from further consideration) if 
Test 1: Its bound $\le Z^*$,
or 
Test 2: Its LP-relaxation has no feasible solutions, 
or 
Test 3: The optimal solution for its LP-relaxation is integer. (If this solution is 
better than the incumbent, it becomes the new incumbent, and test 1 is reapplied 
to all unfathomed subproblems with the new larger Z*.) 
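
The three tests can be sketched as a single Python function. This is a minimal illustration rather than a definitive implementation: the function name, the convention that an infeasible LP-relaxation is passed as `None`, and the return codes are all assumptions. Test 2 is checked first only because a bound does not exist when the LP-relaxation is infeasible.

```python
from fractions import Fraction
from math import floor

def fathoming_test(lp_solution, lp_value, z_star):
    """Return the number of the test that fathoms the subproblem, or None.

    lp_solution: optimal solution of the LP-relaxation, or None if infeasible.
    lp_value:    optimal Z of the LP-relaxation (ignored if infeasible).
    z_star:      Z of the current incumbent (-infinity if there is none).
    """
    if lp_solution is None:                       # Test 2: no feasible solutions
        return 2
    if floor(lp_value) <= z_star:                 # Test 1: bound <= Z*
        return 1
    if all(Fraction(v).denominator == 1 for v in lp_solution):
        return 3                                  # Test 3: LP optimum is integer
    return None                                   # remains under consideration

NO_INCUMBENT = float("-inf")

# First iteration of the example: Subproblem 1 is fathomed by test 3 and
# supplies the incumbent Z* = 9; Subproblem 2 then survives all three tests.
results = [
    fathoming_test([0, 1, 0, 1], 9, NO_INCUMBENT),
    fathoming_test([1, Fraction(4, 5), 0, Fraction(4, 5)], Fraction(81, 5), 9),
]
print(results)
```

A caller that receives `3` would still need to update the incumbent and reapply test 1 to the other unfathomed subproblems, as the summary describes.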

Figure 13.4 The solution tree after the first iteration of the
BIP branch-and-bound algorithm for the example. (Incumbent: (0, 1, 0, 1), with Z* = 9.)


Figure 13.4 summarizes the results of applying these three tests to Subproblems 
1 and 2 by showing the current solution tree. Only Subproblem 1 has been fathomed, 
by test 3, as indicated by the F(3) next to the $x_1 = 0$ node. The resulting incumbent
also is identified below this node. 

The subsequent iterations will illustrate successful applications of all three tests. 
However, before continuing the example, let us summarize the algorithm being applied
to this BIP problem. (This algorithm assumes that all coefficients in the
objective function are integer, and that the ordering of the variables for branching is
$x_1, x_2, \ldots, x_n$.)


Summary of BIP Branch-and-Bound Algorithm 


Initialization Step: Set $Z^* = -\infty$. Apply the bounding step, fathoming step, and
optimality test described below to the whole problem. If not fathomed, classify this
problem as the one remaining "subproblem" for performing the first full iteration
below.


Steps for Each Iteration: 


1. Branching: Among the remaining (unfathomed) subproblems, select the one 
that was created most recently. (Break ties according to which has the larger 
bound.) Branch from the node for this subproblem to create two new subproblems
by fixing the next variable (the branching variable) at either 0
or 1.

2. Bounding: For each new subproblem, obtain its bound by applying the sim- 
plex method to its LP-relaxation and rounding down the value of Z for the 
resulting optimal solution. 

3. Fathoming: For each new subproblem, apply the three fathoming tests sum- 
marized above, and discard those subproblems that are fathomed by any of 
the tests. 


Optimality Test: Stop when there are no remaining subproblems; the current incumbent
is optimal.¹ Otherwise, return to perform another iteration.


¹ If there is no incumbent, the conclusion is that the problem has no feasible solutions.

The branching step for this algorithm warrants a comment as to why the subproblem
to branch from is selected in this way. One option not used would have been
always to select the remaining subproblem with the best bound, because this subproblem
would be the most promising one to contain an optimal solution for the whole
problem. The reason for using instead the option of selecting the most recently created 
subproblem is that LP-relaxations are being solved in the bounding step. Rather than 
starting the simplex method from scratch each time, each LP-relaxation generally is 
solved by reoptimization in large-scale implementations of this algorithm. This reop- 
timization involves revising the final simplex tableau from the preceding LP-relaxation 
as needed because of the few differences in the model (just as for sensitivity analysis), 
and then applying a few iterations of perhaps the dual simplex method. This reopti- 
mization tends to be much faster than starting from scratch, provided the preceding 
and current models are closely related. The models will tend to be closely related 
under the branching rule used, but not when you are skipping around in the solution 
tree by selecting the subproblem with the best bound. 


Completing the Example 


The pattern for the remaining iterations will be quite similar to that for the first iteration 
described above except for the ways in which fathoming occurs. Therefore, we shall 
summarize the branching and bounding steps fairly briefly and then focus on the 
fathoming step. 


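ITERATION 2: The only remaining subproblem corresponds to the $x_1 = 1$ node in
Fig. 13.4, so we shall branch from this node to create the two new subproblems given
below.


Subproblem 3: ($x_1 = 1$, $x_2 = 0$)

$$
\begin{aligned}
\text{Maximize}\quad & Z = 9 + 6x_3 + 4x_4, \\
\text{subject to}\quad & 5x_3 + 2x_4 \le 4 \\
& x_3 + x_4 \le 1 \\
& x_3 \le 1 \\
& x_4 \le 0 \\
\text{and}\quad & x_j \text{ is binary, for } j = 3, 4.
\end{aligned}
$$


Subproblem 4: ($x_1 = 1$, $x_2 = 1$)

$$
\begin{aligned}
\text{Maximize}\quad & Z = 14 + 6x_3 + 4x_4, \\
\text{subject to}\quad & 5x_3 + 2x_4 \le 1 \\
& x_3 + x_4 \le 1 \\
& x_3 \le 1 \\
& x_4 \le 1 \\
\text{and}\quad & x_j \text{ is binary, for } j = 3, 4.
\end{aligned}
$$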


The LP-relaxations of these subproblems are obtained by replacing the constraints,
$x_j$ is binary for $j = 3, 4$, by $0 \le x_j \le 1$ for $j = 3, 4$. Their optimal solutions
(plus the fixed values of $x_1$ and $x_2$) are


LP-relaxation of Subproblem 3: $(x_1, x_2, x_3, x_4) = (1, 0, \tfrac{4}{5}, 0)$, with $Z = 13\tfrac{4}{5}$,
LP-relaxation of Subproblem 4: $(x_1, x_2, x_3, x_4) = (1, 1, 0, \tfrac{1}{2})$, with $Z = 16$.


The resulting bounds for the subproblems are


Bound for Subproblem 3: $Z \le 13$,
Bound for Subproblem 4: $Z \le 16$.


Note that both of these bounds are larger than Z* = 9, so fathoming test 1 fails 
in both cases. Test 2 also fails, since both LP-relaxations have feasible solutions (as 
indicated by the existence of an optimal solution). Alas, test 3 fails as well, because 
both optimal solutions include variables with noninteger values.

Figure 13.5 shows the resulting solution tree at this point. The lack of an F to 
the right of either new node indicates that both remain unfathomed. 


ITERATION 3: So far, the algorithm has created four subproblems. Subproblem 1 
has been fathomed, and Subproblem 2 has been replaced by (separated into) Sub- 
problems 3 and 4, but these latter two remain under consideration. Because they were 
created simultaneously, but Subproblem 4 ($x_1 = 1$, $x_2 = 1$) has the larger bound
(16 > 13), the next branching is done from the $(x_1, x_2) = (1, 1)$ node in the solution
tree, which creates the following new subproblems. 





Subproblem 5: ($x_1 = 1$, $x_2 = 1$, $x_3 = 0$)

$$
\begin{aligned}
\text{Maximize}\quad & Z = 14 + 4x_4, \\
\text{subject to}\quad & 2x_4 \le 1 \\
& x_4 \le 1 \quad \text{(twice)} \\
\text{and}\quad & x_4 \text{ is binary.}
\end{aligned}
$$


Subproblem 6: ($x_1 = 1$, $x_2 = 1$, $x_3 = 1$)

$$
\begin{aligned}
\text{Maximize}\quad & Z = 20 + 4x_4, \\
\text{subject to}\quad & 2x_4 \le -4 \\
& x_4 \le 0 \\
& x_4 \le 1 \\
\text{and}\quad & x_4 \text{ is binary.}
\end{aligned}
$$


Figure 13.5 The solution tree after iteration 2 of the
BIP branch-and-bound algorithm for the example. (Incumbent: (0, 1, 0, 1), with Z* = 9.)


Forming their LP-relaxations by replacing $x_4$ is binary by $0 \le x_4 \le 1$, the
following results are obtained:


LP-relaxation of Subproblem 5: $(x_1, x_2, x_3, x_4) = (1, 1, 0, \tfrac{1}{2})$, with $Z = 16$.
LP-relaxation of Subproblem 6: No feasible solutions.
Bound for Subproblem 5: $Z \le 16$.


Note how the constraints, $2x_4 \le -4$ and $x_4 \ge 0$, in the LP-relaxation of Subproblem
6 prevent any feasible solutions. Therefore, this subproblem is fathomed by test 2.
However, Subproblem 5 fails this test, as well as test 1 (16 > 9) and test 3 ($x_4 = \tfrac{1}{2}$
is not integer), so it remains under consideration.

We now have the solution tree shown in Fig. 13.6.


ITERATION 4: The subproblems corresponding to nodes (1, 0) and (1, 1, 0) in Fig.
13.6 remain under consideration, but the latter node was created more recently, so it
is selected for branching from next. Since the resulting branching variable $x_4$ is the
last variable, fixing its value at either 0 or 1 actually creates a single solution rather
than subproblems requiring fuller investigation. These single solutions are


$x_4 = 0$: $(x_1, x_2, x_3, x_4) = (1, 1, 0, 0)$ is feasible, with $Z = 14$,
$x_4 = 1$: $(x_1, x_2, x_3, x_4) = (1, 1, 0, 1)$ is infeasible.


Formally applying the fathoming tests, the first solution passes test 3 and the second
passes test 2. Furthermore, this feasible first solution is better than the incumbent
(14 > 9), so it becomes the new incumbent, with $Z^* = 14$.


Figure 13.6 The solution tree after iteration 3 of the BIP branch-and-bound algorithm for the example. (Incumbent: (0, 1, 0, 1), with Z* = 9.)


Because a new incumbent has been found, we now reapply fathoming test 1
with the new larger value of $Z^*$ to the only remaining subproblem, the one at node
(1, 0).


Subproblem 3: Bound $= 13 \le Z^* = 14$.


Therefore, this subproblem now is fathomed. 

We now have the solution tree shown in Fig. 13.7. Note that there are no 
remaining (unfathomed) subproblems. Consequently, the optimality test indicates that
the current incumbent,


$(x_1, x_2, x_3, x_4) = (1, 1, 0, 0)$,


is optimal, so we are done. 
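
The whole procedure can be reproduced with a compact Python sketch. Several things here are assumptions made for illustration and are not the book's implementation: the data `C`, `A`, `B` are the Sec. 13.1 model as implied by the subproblems above; each LP-relaxation is solved by brute-force vertex enumeration (a stand-in for the simplex method that is only sensible for a handful of variables); and no reoptimization is performed between subproblems.

```python
from fractions import Fraction
from itertools import combinations
from math import floor

C = [9, 5, 6, 4]                                       # assumed objective coefficients
A = [[6, 3, 5, 2], [0, 0, 1, 1], [-1, 0, 1, 0], [0, -1, 0, 1]]
B = [10, 1, 0, 0]
N = len(C)

def solve_square(rows, rhs):
    """Gauss-Jordan elimination over exact fractions; None if the system is singular."""
    n = len(rows)
    m = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(rows, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * q for a, q in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]

def solve_lp(fixed):
    """Maximize C.x subject to A x <= B and 0 <= x <= 1, with x[j] fixed for j in `fixed`.
    Returns (Z, x) at an optimal vertex, or None if there are no feasible solutions."""
    cons = list(zip(A, B))
    for j in range(N):
        unit = [1 if k == j else 0 for k in range(N)]
        lo, hi = (fixed[j], fixed[j]) if j in fixed else (0, 1)
        cons.append((unit, hi))                        #  x_j <= hi
        cons.append(([-u for u in unit], -lo))         # -x_j <= -lo
    best = None
    for subset in combinations(cons, N):               # candidate vertices
        x = solve_square([r for r, _ in subset], [s for _, s in subset])
        if x is None:
            continue
        if all(sum(a * v for a, v in zip(row, x)) <= rhs for row, rhs in cons):
            z = sum(c * v for c, v in zip(C, x))
            if best is None or z > best[0]:
                best = (z, x)
    return best

def branch_and_bound():
    z_star, incumbent = float("-inf"), None

    def examine(fixed):
        """Bound one subproblem and apply the fathoming tests.
        Returns (bound, fixed) if the subproblem survives, else None."""
        nonlocal z_star, incumbent
        lp = solve_lp(fixed)
        if lp is None:                                 # test 2
            return None
        z, x = lp
        bound = floor(z)                               # integer objective coefficients
        if bound <= z_star:                            # test 1
            return None
        if all(v.denominator == 1 for v in x):         # test 3: new incumbent
            z_star, incumbent = z, tuple(int(v) for v in x)
            return None
        return (bound, fixed)

    stack = [s for s in [examine({})] if s]
    while stack:
        _, fixed = stack.pop()                         # most recently created subproblem
        j = len(fixed)                                 # next variable in natural order
        children = [examine({**fixed, j: v}) for v in (0, 1)]
        # Push the survivor with the larger bound last so that it is selected first.
        stack += sorted((s for s in children if s), key=lambda s: s[0])
        stack = [s for s in stack if s[0] > z_star]    # reapply test 1 with current Z*
    return z_star, incumbent

print(branch_and_bound())
```

Under these assumptions the sketch follows the same sequence of subproblems as the four iterations above and ends with the incumbent (1, 1, 0, 0) and $Z^* = 14$.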


Other Options with the Branch-and-Bound Technique 


This section has illustrated the branch-and-bound technique by describing a basic 
branch-and-bound algorithm for solving BIP problems. However, the general frame- 
work of the branch-and-bound technique provides a great deal of flexibility in how to 
design a specific algorithm for any given type of problem such as BIP. There are 
many options available, and constructing an efficient algorithm requires tailoring the 
specific design to fit the specific structure of the problem type. 

Every branch-and-bound algorithm has the same three basic steps of branching, 
bounding, and fathoming. The flexibility is in how these steps are performed. 

Branching always involves selecting one remaining subproblem and dividing it 
into smaller subproblems. The flexibility here is in the rules for selecting and dividing. 
Our BIP algorithm selected the most recently created subproblem, because this is very 
efficient for reoptimizing each LP-relaxation from the preceding one. Selecting the 
subproblem with the best bound is the other most popular rule, because it tends to 
lead more quickly to better incumbents and so more fathoming. Combinations of the 
two rules also can be used. The dividing typically (but not always) is done by choosing
a branching variable and assigning it either individual values (e.g., our BIP algorithm)
or ranges of values (e.g., the algorithm in the next section). More sophisticated
algorithms generally use a rule for strategically choosing a branching variable that
should tend to lead to early fathoming.

Figure 13.7 The solution tree after the final (fourth) iteration of the BIP branch-and-bound algorithm
for the example. (Incumbent: (1, 1, 0, 0) = optimal solution, with Z* = 14.)

Bounding usually is done by solving a relaxation. However, there are a variety 
of ways to form relaxations. For example, consider the Lagrangian relaxation, where 
the entire set of functional constraints, $Ax \le b$ (in matrix notation), is deleted (except
possibly for any "convenient" constraints), and then the objective function,


Maximize $Z = cx$,

is replaced by

Maximize $Z_R = cx - \lambda(Ax - b)$,


where the fixed vector $\lambda \ge 0$. If $x^*$ is an optimal solution for the original problem,
its $Z \le Z_R$ (because $\lambda \ge 0$ and $Ax^* \le b$ make the subtracted term $\lambda(Ax^* - b)$ nonpositive),
so solving the Lagrangian relaxation for the optimal value of $Z_R$ provides
a valid bound. If $\lambda$ is chosen well, this bound tends to be a reasonably tight one (at
least comparable to the bound from the LP-relaxation). Without any functional constraints,
this relaxation also can be solved extremely quickly. The drawbacks are that
fathoming tests 2 and 3 (revised) are not as powerful as for the LP-relaxation. However,
the two relaxations occasionally are used together to good advantage.
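
To see why this relaxation is quick to solve, consider a BIP problem in which only the binary restrictions remain after the functional constraints are deleted. Then $Z_R = (c - \lambda A)x + \lambda b$ is maximized by setting $x_j = 1$ exactly when its coefficient is positive. The Python sketch below illustrates this. The data are the assumed Sec. 13.1 model used elsewhere in this section, and the multiplier values are arbitrary illustrative choices, not values from the text.

```python
from fractions import Fraction

C = [9, 5, 6, 4]                                       # assumed objective coefficients
A = [[6, 3, 5, 2], [0, 0, 1, 1], [-1, 0, 1, 0], [0, -1, 0, 1]]
B = [10, 1, 0, 0]

def lagrangian_bound(lam):
    """Maximize Z_R = c x - lam (A x - b) over binary x, for a fixed lam >= 0.
    Returns (Z_R, x)."""
    assert all(l >= 0 for l in lam), "multipliers must be nonnegative"
    # Coefficient of x_j in Z_R is c_j - sum_i lam_i * a_ij.
    reduced = [C[j] - sum(lam[i] * A[i][j] for i in range(len(A))) for j in range(len(C))]
    x = [1 if r > 0 else 0 for r in reduced]           # take only positive coefficients
    z_r = sum(r for r in reduced if r > 0) + sum(l * bi for l, bi in zip(lam, B))
    return z_r, x

# With all multipliers zero the constraints are simply ignored (a loose bound).
loose, _ = lagrangian_bound([0, 0, 0, 0])

# A multiplier of 3/2 on the first constraint (an illustrative choice) is tighter.
tight, x_tight = lagrangian_bound([Fraction(3, 2), 0, 0, 0])

print(loose, tight, x_tight)
```

Both values are valid upper bounds on the optimal $Z = 14$ found earlier, and the second matches the LP-relaxation value of $16\tfrac{1}{2}$ for the whole problem, consistent with the remark that a well-chosen $\lambda$ gives a bound comparable to that of the LP-relaxation.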

In general terms, the two features sought in choosing a relaxation are that it can 
be solved relatively quickly and that it provides a relatively tight bound. Neither alone 
is adequate. The LP-relaxation is popular because it provides an excellent trade-off 
between these two factors. 

One option occasionally employed is to use a quickly solved relaxation and 
then, if fathoming is not achieved, to tighten the relaxation in some way to obtain a 
somewhat tighter bound. 

Fathoming generally is done pretty much as described for the BIP algorithm. 
The three fathoming criteria can be stated in more general terms as follows. 


Summary of Fathoming Criteria 


A subproblem is fathomed if an analysis of its relaxation reveals that 
Criterion 1: Feasible solutions of the subproblem must have $Z \le Z^*$, or
Criterion 2: The subproblem has no feasible solution, or 
Criterion 3: An optimal solution of the subproblem has been found. 


Just as for the BIP algorithm, the first two criteria usually are applied by solving the 
relaxation to obtain a bound for the subproblem, and then checking whether this bound 
is $\le Z^*$ (test 1) or whether the relaxation has no feasible solutions (test 2). If the
relaxation differs from the subproblem only by the deletion (or loosening) of some 
constraints, then the third criterion usually is applied by checking whether the optimal 
solution for the relaxation is feasible for the subproblem, in which case it must be 
optimal for the subproblem. For other relaxations (such as the Lagrangian relaxation), 
additional analysis is required to determine whether the optimal solution for the re- 
laxation is also optimal for the subproblem. 

If the original problem involves minimization rather than maximization, two 
options are available. One is to convert to maximization in the usual way (see Sec. 
4.6). The other is to convert the branch-and-bound algorithm directly into minimization
form, where the most important adjustment is to change the direction of the
inequality for fathoming test 1 from

Is the subproblem's bound $\le Z^*$?

to

Is the subproblem's bound $\ge Z^*$?


So far, we have described how to use the branch-and-bound technique to find 
only one optimal solution. However, in the case of ties for the optimal solution, it is 
sometimes desirable to identify all these optimal solutions so that the final choice 
among them can be made on the basis of intangible factors not incorporated into the 
mathematical model. To find them all, you need to make only a few slight alterations 
in the procedure. First, change the weak inequality for fathoming test 1 (Is the subproblem's
bound $\le Z^*$?) to a strict inequality (Is the subproblem's bound $< Z^*$?), so
that fathoming will not occur if the subproblem can have a feasible solution equal to
the incumbent. Second, if fathoming test 3 passes and the optimal solution for the
subproblem has Z = Z*, then store this solution as another (tied) incumbent. Third, 
if test 3 provides a new incumbent (tied or otherwise), then check whether the optimal 
solution obtained for the relaxation is unique. If it is not, then identify the other 
optimal solutions for the relaxation and check whether they are optimal for the sub- 
problem as well, in which case they also become incumbents. Finally, when the 
optimality test finds that there are no remaining (unfathomed) subsets, all the current 
incumbents will be the optimal solutions. 

Finally, it should be noted that rather than finding an optimal solution, the 
branch-and-bound technique can also be used to find a nearly optimal solution, generally
with much less computational effort. For some applications, a solution is "good
enough" if its $Z$ is "close enough" to the value of $Z$ for an optimal solution (call it
$Z^{**}$). "Close enough" can be defined as either


$Z \ge Z^{**} - K$ or $Z \ge (1 - \alpha)Z^{**}$


for a specified (positive) constant $K$ or $\alpha$. For example, if $\alpha = 0.05$, then the solution
is required to be within 5 percent of optimal. To find a solution that is "close enough"
to being optimal, only one change is needed in the usual branch-and-bound procedure.
This change is to replace the usual fathoming test 1 for a subproblem,


Bound $\le Z^*$?

by either

Bound $- K \le Z^*$?

or

$(1 - \alpha)$ bound $\le Z^*$?


and then perform this test after test 3 (so that a feasible solution found with Z > Z* 
is still kept as the new incumbent). The reason this weaker test 1 suffices is that 
regardless of how close Z for the subproblem’s (unknown) optimal solution is to the 
subproblem’s bound, the incumbent is still ‘‘close enough’’ to this solution (if the 
new inequality holds) that the subproblem does not need to be considered further. 
When there are no remaining subproblems, the current incumbent will be the desired 
nearly optimal solution. However, it is much easier to fathom with this new fathoming 
test (in either form), so the algorithm should run much faster. For a large problem, 
this acceleration may make the difference between finishing with a solution guaranteed 
to be close to optimal and never terminating. 
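
The relaxed versions of fathoming test 1 amount to one-line changes, as the Python sketch below suggests. The function names and the numerical values (a subproblem bound of 16 against an incumbent with $Z^* = 14$) are hypothetical and chosen only to show when each form of the test succeeds.

```python
from fractions import Fraction

def fathom_exact(bound, z_star):
    """Usual test 1: bound <= Z*."""
    return bound <= z_star

def fathom_within_k(bound, z_star, k):
    """Relaxed test 1 for an absolute tolerance K: bound - K <= Z*."""
    return bound - k <= z_star

def fathom_within_fraction(bound, z_star, alpha):
    """Relaxed test 1 for a relative tolerance alpha: (1 - alpha) * bound <= Z*."""
    return (1 - alpha) * bound <= z_star

bound, z_star = 16, 14   # hypothetical subproblem bound and incumbent value

print(fathom_exact(bound, z_star))                             # not fathomed
print(fathom_within_k(bound, z_star, 2))                       # fathomed: 16 - 2 <= 14
print(fathom_within_fraction(bound, z_star, Fraction(1, 20)))  # 5 percent is too tight
print(fathom_within_fraction(bound, z_star, Fraction(1, 8)))   # fathomed: 14 <= 14
```

Setting `k = 0` or `alpha = 0` recovers the usual test, which makes it easy to switch between exact and approximate runs of the same algorithm.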


13.5 A Branch-and-Bound Algorithm for Mixed 
Integer Programming 


We shall now consider the general MIP problem, where some of the variables (say,
$I$ of them) are restricted to integer values (but not necessarily just 0 and 1), but the
rest are ordinary continuous variables. For notational convenience, we shall order the
variables so that the first $I$ variables are the integer-restricted variables. Therefore,
the general form of the problem being considered is 


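$$
\begin{aligned}
\text{Maximize}\quad & Z = \sum_{j=1}^{n} c_j x_j, \\
\text{subject to}\quad & \sum_{j=1}^{n} a_{ij} x_j \le b_i, \quad \text{for } i = 1, 2, \ldots, m, \\
\text{and}\quad & x_j \ge 0, \quad \text{for } j = 1, 2, \ldots, n, \\
& x_j \text{ is integer}, \quad \text{for } j = 1, 2, \ldots, I \ (I \le n).
\end{aligned}
$$


(When $I = n$, this problem becomes the pure IP problem.)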

We shall describe a basic branch-and-bound algorithm for solving this problem 
that, with a variety of refinements, has provided the standard approach to MIP. The 
structure of this algorithm was first developed by R. J. Dakin,¹ based on a pioneering
branch-and-bound algorithm by A. H. Land and A. G. Doig.²

This algorithm is quite similar in structure to the BIP algorithm presented in the 
preceding section. Solving LP-relaxations again provides the basis for both the bound- 
ing and fathoming steps. In fact, only four changes are needed in the BIP algorithm 
to deal with the generalizations from binary to general integer variables and from 
pure IP to mixed IP. 

One change involves the choice of the branching variable. Before, the next 
variable in the natural ordering ($x_1, x_2, \ldots, x_n$) was chosen automatically. Now,
the only variables considered are the integer-restricted variables that have a noninteger 
value in the optimal solution for the LP-relaxation of the current subproblem. Our 
rule for choosing among these variables is to select the first one in the natural ordering. 
(Production codes generally use a more sophisticated rule.) 

The second change involves the values assigned to the branching variable for 
creating the new smaller subproblems. Before, the binary variable was fixed at 0 and 
1, respectively, for the two new subproblems. Now, the general integer-restricted 
variable could have a very large number of possible integer values, and it would be 
inefficient to create and analyze many subproblems by fixing the variable at its individual
integer values. Therefore, what is done instead is to create just two new subproblems
(as before) by specifying two ranges of values for the variable.

¹ Dakin, R. J.: “A Tree Search Algorithm for Mixed Integer Programming Problems,” Computer Journal, 8(3):250-255, 1965. 

² Land, A. H., and A. G. Doig: “An Automatic Method of Solving Discrete Programming Problems,” Econometrica, 28:497-520, 1960. 


To spell out how this is done, let $x_j$ be the current branching variable, and let 
$x_j^*$ be its (noninteger) value in the optimal solution for the LP-relaxation of the current 
subproblem. Using square brackets to denote 

    $[x_j^*]$ = greatest integer $\le x_j^*$, 

the range of values for the two new subproblems is 

    $x_j \le [x_j^*]$ and $x_j \ge [x_j^*] + 1$, 

respectively. Each inequality becomes an additional constraint for that new subproblem. 
For example, if $x_j^* = 3\tfrac{1}{2}$, then 

    $x_j \le 3$ and $x_j \ge 4$ 

are the respective additional constraints for the new subproblem. 
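The branching rule can be sketched in a few lines of Python. The function name `branching_constraints` is our own, and exact fractions are used only to keep the arithmetic free of rounding noise; this is an illustration, not part of the algorithm's formal statement.

```python
from fractions import Fraction
from math import floor

def branching_constraints(x_star):
    """Return (upper, lower) for the two new subproblems.

    The first subproblem adds x_j <= upper, the second adds x_j >= lower.
    """
    if x_star == floor(x_star):
        raise ValueError("the branching variable must have a noninteger value")
    upper = floor(x_star)            # [x_j*], the greatest integer <= x_j*
    return upper, upper + 1

print(branching_constraints(Fraction(7, 2)))   # x_j* = 3 1/2 gives (3, 4)
```

Whatever the value of the branching variable, the two ranges never overlap and together exclude only noninteger values, so no integer solution is lost by branching.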

When the two changes to the BIP algorithm described above are combined, an 
interesting phenomenon of a recurring branching variable can occur. To illustrate, 
let $j = 1$ in the above example where $x_j^* = 3\tfrac{1}{2}$, and consider the new subproblem 
where $x_1 \le 3$. When the LP-relaxation of a descendant of this subproblem is solved, 
suppose that $x_1^* = 1\tfrac{1}{4}$. Then $x_1$ recurs as the branching variable, and the two new 
subproblems created have the additional constraints $x_1 \le 1$ and $x_1 \ge 2$, respectively 
(as well as the previous additional constraint, $x_1 \le 3$). Later, when the LP-relaxation 
for a descendant of, say, the $x_1 \le 1$ subproblem is solved, suppose that $x_1^* = \tfrac{3}{4}$. Then 
$x_1$ recurs again as the branching variable, and the two new subproblems created have 
$x_1 = 0$ (because of the new $x_1 \le 0$ constraint and the nonnegativity constraint on $x_1$) 
and $x_1 = 1$ (because of the new $x_1 \ge 1$ constraint and the previous $x_1 \le 1$ constraint). 
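The narrowing of the range of a recurring branching variable can be traced with a short Python sketch. The helper `branch` and the LP values used (3.5, then a value between 1 and 2, then a value between 0 and 1; here 1.25 and 0.75) are illustrative assumptions of ours.

```python
from math import floor, inf

def branch(lo, hi, x_star):
    """Split the current range [lo, hi] of a variable at its noninteger LP value."""
    k = floor(x_star)
    return (lo, min(hi, k)), (max(lo, k + 1), hi)

rng = (0, inf)                      # only the nonnegativity constraint so far
rng, _ = branch(*rng, 3.5)          # follow the x1 <= 3 branch
rng, _ = branch(*rng, 1.25)         # x1 recurs: follow the x1 <= 1 branch
left, right = branch(*rng, 0.75)    # x1 recurs again
print(left, right)                  # (0, 0) (1, 1)
```

Each recurrence shrinks the range, so a variable with a bounded range can recur only a finite number of times before it is fixed at a single integer value.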

The third change involves the bounding step. Before, with a pure IP problem 
and integer coefficients in the objective function, the value of Z for the optimal solution 
for the subproblem’s LP-relaxation was rounded down to obtain the bound, because 
any feasible solution for the subproblem must have an integer Z. Now, with some of 
the variables not integer-restricted, the bound is the value of Z without rounding down. 

The fourth (and final) change to the BIP algorithm to obtain our MIP algorithm 
involves fathoming test 3. Before, with a pure IP problem, the test was that the 
optimal solution for the subproblem’s LP-relaxation is integer, since this ensures that 
the solution is feasible, and therefore optimal, for the subproblem. Now, with a mixed 
IP problem, the test requires only that the integer-restricted variables be integer in 
the optimal solution for the subproblem’s LP-relaxation, because this suffices to ensure 
that the solution is feasible, and therefore optimal, for the subproblem. 
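The third and fourth changes amount to two small tests, sketched below in Python. The function names and the sample solution are our own illustrative choices.

```python
from fractions import Fraction
from math import floor

def bound(z_lp, pure_ip_integer_coeffs):
    """Bound for a subproblem, given Z for the optimum of its LP-relaxation."""
    # Rounding down is valid only when every feasible Z must be an integer.
    return floor(z_lp) if pure_ip_integer_coeffs else z_lp

def passes_test3(solution, integer_indices):
    """True if every integer-restricted variable has an integer value."""
    return all(solution[j] == floor(solution[j]) for j in integer_indices)

sol = [0, 0, 2, Fraction(1, 2)]           # the last variable is continuous
print(passes_test3(sol, [0, 1, 2]))       # True: mixed IP, last variable unrestricted
print(passes_test3(sol, [0, 1, 2, 3]))    # False: would fail for a pure IP problem
print(bound(Fraction(27, 2), False))      # 27/2: no rounding for a mixed IP problem
print(bound(Fraction(33, 2), True))       # 16: rounded down for a pure IP problem
```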

Incorporating these four changes into the summary presented in the preceding 
section for the BIP algorithm yields the following summary for the new algorithm for 
MIP. 


Summary of MIP Branch-and-Bound Algorithm 


Initialization Step: Set $Z^* = -\infty$. Apply the bounding step, fathoming step, and 
optimality test described below to the whole problem. If not fathomed, classify this 
problem as the one remaining “subproblem” for performing the first full iteration 
below. 


Steps for Each Iteration: 


1. Branching: Among the remaining (unfathomed) subproblems, select the one 
that was created most recently. (Break ties according to which has the larger 
bound.) Among the integer-restricted variables that have a noninteger value 
in the optimal solution for the LP-relaxation of the subproblem, choose the 
first one in the natural ordering of the variables to be the branching variable. 
Let $x_j$ be this variable, and $x_j^*$ its value in this solution. Branch from the 
node for the subproblem to create two new subproblems by adding the respective 
constraints, $x_j \le [x_j^*]$ and $x_j \ge [x_j^*] + 1$. 

2. Bounding: For each new subproblem, obtain its bound by applying the sim- 
plex method (or the dual simplex method when reoptimizing) to its LP- 
relaxation and using the value of Z for the resulting optimal solution. 

3. Fathoming: For each new subproblem, apply the three fathoming tests given 
below, and discard those subproblems that are fathomed by any of the tests. 

Test 1: Its bound $\le Z^*$, where $Z^*$ is the value of Z for the current 
incumbent. 

Test 2: Its LP-relaxation has no feasible solutions. 

Test 3: The optimal solution for its LP-relaxation has integer values for 
the integer-restricted variables. (If this solution is better than the incumbent, 
it becomes the new incumbent and test 1 is reapplied to all unfathomed 
subproblems with the new larger $Z^*$.) 


Optimality Test: Stop when there are no remaining subproblems; the current incumbent 
is optimal.¹ Otherwise, return to perform another iteration. 


Example 
We will now illustrate this algorithm by applying it to the following MIP problem: 


    Maximize $Z = 4x_1 - 2x_2 + 7x_3 - x_4$, 

subject to 

    $x_1 + 5x_3 \le 10$ 
    $x_1 + x_2 - x_3 \le 1$ 
    $6x_1 - 5x_2 \le 0$ 
    $-x_1 + 2x_3 - 2x_4 \le 3$ 

and 

    $x_j \ge 0$, for $j = 1, 2, 3, 4$ 
    $x_j$ is an integer, for $j = 1, 2, 3$. 


Note that the number of integer-restricted variables is $I = 3$, so $x_4$ is the only 
continuous variable. 


INITIALIZATION STEP: After setting $Z^* = -\infty$, we form the LP-relaxation of this 
problem by deleting the set of constraints, $x_j$ is an integer for $j = 1, 2, 3$. Applying 
the simplex method to this LP-relaxation yields its optimal solution below. 


LP-relaxation of whole problem: $(x_1, x_2, x_3, x_4) = (\tfrac{5}{4}, \tfrac{3}{2}, \tfrac{7}{4}, 0)$, with $Z = 14\tfrac{1}{4}$. 


Because it has feasible solutions and this optimal solution has noninteger values for 
its integer-restricted variables, the whole problem is not fathomed, so the algorithm 
continues with the first full iteration below. 
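The reported solution for the LP-relaxation can be checked with exact arithmetic. The Python sketch below is ours; the coefficient lists are our transcription of the example. It confirms feasibility and the value of Z, but it does not by itself prove optimality.

```python
from fractions import Fraction as F

c = [4, -2, 7, -1]                        # objective coefficients
A = [[1, 0, 5, 0],                        #  x1 + 5x3        <= 10
     [1, 1, -1, 0],                       #  x1 + x2 - x3    <= 1
     [6, -5, 0, 0],                       # 6x1 - 5x2        <= 0
     [-1, 0, 2, -2]]                      # -x1 + 2x3 - 2x4  <= 3
b = [10, 1, 0, 3]

x = [F(5, 4), F(3, 2), F(7, 4), F(0)]     # reported LP-relaxation optimum

lhs = [sum(a * v for a, v in zip(row, x)) for row in A]
z = sum(cj * v for cj, v in zip(c, x))

print(all(l <= r for l, r in zip(lhs, b)))    # True: the point is feasible
print([l == r for l, r in zip(lhs, b)])       # [True, True, True, False]
print(z)                                      # 57/4, that is, 14 1/4
```

The first three functional constraints hold with equality at this point, while the fourth has slack.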


¹ If there is no incumbent, the conclusion is that the problem has no feasible solutions. 


ITERATION 1: In this optimal solution for the LP-relaxation, the first integer-restricted 
variable that has a noninteger value is $x_1 = \tfrac{5}{4}$, so $x_1$ becomes the branching 
variable. Branching from the All node (All feasible solutions) with this branching 
variable then creates the following two subproblems: 


Subproblem 1: Original problem plus additional constraint: 

    $x_1 \le 1$. 

Subproblem 2: Original problem plus additional constraint: 

    $x_1 \ge 2$. 

Deleting the set of integer constraints again, and solving the resulting LP-relaxations 
of these two subproblems yields the following results. 


LP-relaxation of Subproblem 1: $(x_1, x_2, x_3, x_4) = (1, \tfrac{6}{5}, \tfrac{9}{5}, 0)$, with $Z = 14\tfrac{1}{5}$. 
Bound for Subproblem 1: $Z \le 14\tfrac{1}{5}$. 
LP-relaxation of Subproblem 2: No feasible solutions. 


This outcome for Subproblem 2 means that it is fathomed by test 2. However, 
just as for the whole problem, Subproblem 1 fails all fathoming tests. 
These results are summarized in the solution tree shown in Fig. 13.8. 


ITERATION 2: With only one remaining subproblem, corresponding to the $x_1 \le 1$ 
node in Fig. 13.8, the next branching is from this node. Examining the optimal 
solution for its LP-relaxation reveals that the branching variable is $x_2$, 
because $x_2 = \tfrac{6}{5}$ is the first integer-restricted variable that has a noninteger value. 
Adding one of the constraints, $x_2 \le 1$ or $x_2 \ge 2$, then creates the following two new 
subproblems. 

Subproblem 3: Original problem plus additional constraints: 

    $x_1 \le 1$ 
    $x_2 \le 1$. 

Subproblem 4: Original problem plus additional constraints: 

    $x_1 \le 1$ 
    $x_2 \ge 2$. 


Figure 13.8 The solution tree after the first iteration of the 
MIP branch-and-bound algorithm for the example. 


Solving their LP-relaxations gives the following results. 


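LP-relaxation of Subproblem 3: $(x_1, x_2, x_3, x_4) = (\tfrac{5}{6}, 1, \tfrac{11}{6}, 0)$, with $Z = 14\tfrac{1}{6}$. 
Bound for Subproblem 3: $Z \le 14\tfrac{1}{6}$. 
LP-relaxation of Subproblem 4: $(x_1, x_2, x_3, x_4) = (\tfrac{5}{6}, 2, \tfrac{11}{6}, 0)$, with $Z = 12\tfrac{1}{6}$. 
Bound for Subproblem 4: $Z \le 12\tfrac{1}{6}$. 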


Because both solutions exist (feasible solutions) and have noninteger values for 
integer-restricted variables, neither subproblem is fathomed. (Test 1 still isn’t 
operational, since $Z^* = -\infty$ until the first incumbent is found.) 

The solution tree at this point is given in Fig. 13.9. 


ITERATION 3: With two remaining subproblems (3 and 4) that were created simultaneously, 
the one with the larger bound (Subproblem 3, with $14\tfrac{1}{6} > 12\tfrac{1}{6}$) is 
selected for the next branching. Because $x_1 = \tfrac{5}{6}$ has a noninteger value in the optimal 
solution for this subproblem’s LP-relaxation, $x_1$ becomes the branching variable. 
(Note that $x_1$ now is a recurring branching variable, since it also was chosen at iteration 
1.) This leads to the following new subproblems. 


Subproblem 5: Original problem plus additional constraints: 

    $x_1 \le 1$ 
    $x_2 \le 1$ 
    $x_1 \le 0$   (so $x_1 = 0$). 

Subproblem 6: Original problem plus additional constraints: 

    $x_1 \le 1$ 
    $x_2 \le 1$ 
    $x_1 \ge 1$   (so $x_1 = 1$). 

The results from solving their LP-relaxations are given below. 


LP-relaxation of Subproblem 5: $(x_1, x_2, x_3, x_4) = (0, 0, 2, \tfrac{1}{2})$, with $Z = 13\tfrac{1}{2}$. 
Bound for Subproblem 5: $Z \le 13\tfrac{1}{2}$. 
LP-relaxation of Subproblem 6: No feasible solutions. 





Figure 13.9 The solution tree after the second iteration of the MIP branch-and-bound algorithm for the 
example. 


Figure 13.10 The solution tree after the final (third) iteration of the MIP branch-and-bound algorithm 
for the example. 


Subproblem 6 is immediately fathomed by test 2. However, note that Subproblem 
5 also can be fathomed. Test 3 passes because the optimal solution for its LP-relaxation 
has integer values ($x_1 = 0$, $x_2 = 0$, $x_3 = 2$) for all three integer-restricted 
variables. (It does not matter that $x_4 = \tfrac{1}{2}$, since $x_4$ is not integer-restricted.) This 
feasible solution for the original problem becomes our first incumbent: 


Incumbent = $(0, 0, 2, \tfrac{1}{2})$, with $Z^* = 13\tfrac{1}{2}$. 


Using this $Z^*$ to reapply fathoming test 1 to the only other subproblem (Subproblem 
4) is successful, because its bound of $12\tfrac{1}{6}$ is $\le Z^*$. 

This iteration has succeeded in fathoming subproblems in all three possible 
ways. Furthermore, there now are no remaining subproblems, so the current incumbent 
is optimal. 


Optimal solution = $(0, 0, 2, \tfrac{1}{2})$, with $Z = 13\tfrac{1}{2}$. 


These results are summarized by the final solution tree given in Fig. 13.10. 
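The whole procedure can be reproduced with a compact Python sketch. Everything named here is our own: `solve_lp` is a brute-force stand-in for the simplex method that enumerates corner points with exact fractions (practical only for tiny problems, and it assumes that every feasible LP-relaxation has a finite optimum), and `mip_branch_and_bound` follows the summary above as we read it. The data are our transcription of the example.

```python
from fractions import Fraction as F
from itertools import combinations
from math import floor

def solve_square(M, rhs):
    """Solve M x = rhs exactly (Gauss-Jordan); return None if M is singular."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if piv is None:
            return None
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def solve_lp(c, rows, rhs):
    """Maximize c.x subject to rows.x <= rhs by enumerating corner points.

    Returns (Z, x), or None if there is no feasible solution.
    """
    best = None
    for idx in combinations(range(len(rows)), len(c)):
        x = solve_square([rows[i] for i in idx], [rhs[i] for i in idx])
        if x is None:
            continue
        if all(sum(a * v for a, v in zip(r, x)) <= h for r, h in zip(rows, rhs)):
            z = sum(cj * v for cj, v in zip(c, x))
            if best is None or z > best[0]:
                best = (z, x)
    return best

def mip_branch_and_bound(c, A, b, int_idx):
    n = len(c)
    rows0 = A + [[-int(k == j) for k in range(n)] for j in range(n)]  # -x_j <= 0
    rhs0 = b + [0] * n
    incumbent = None      # None plays the role of Z* = -infinity
    live = []             # unfathomed subproblems: (iteration, bound, x, rows, rhs)

    def consider(it, rows, rhs):
        nonlocal incumbent
        lp = solve_lp(c, rows, rhs)                           # bounding step
        if lp is None:                                        # test 2
            return
        z, x = lp
        if incumbent is not None and z <= incumbent[0]:       # test 1
            return
        if all(x[j] == floor(x[j]) for j in int_idx):         # test 3
            incumbent = (z, x)
            live[:] = [s for s in live if s[1] > z]           # reapply test 1
        else:
            live.append((it, z, x, rows, rhs))

    consider(0, rows0, rhs0)                                  # initialization step
    it = 0
    while live:                                               # optimality test
        it += 1
        sub = max(live, key=lambda s: (s[0], s[1]))           # newest, then larger bound
        live.remove(sub)
        _, _, x, rows, rhs = sub
        j = next(j for j in int_idx if x[j] != floor(x[j]))   # branching variable
        k = floor(x[j])
        unit = [int(t == j) for t in range(n)]
        consider(it, rows + [unit], rhs + [k])                        # x_j <= [x_j*]
        consider(it, rows + [[-u for u in unit]], rhs + [-(k + 1)])   # x_j >= [x_j*] + 1
    return incumbent

c = [4, -2, 7, -1]
A = [[1, 0, 5, 0], [1, 1, -1, 0], [6, -5, 0, 0], [-1, 0, 2, -2]]
b = [10, 1, 0, 3]
z, x = mip_branch_and_bound(c, A, b, int_idx=[0, 1, 2])
print(z, [str(v) for v in x])     # 27/2 ['0', '0', '2', '1/2']
```

A production code would replace `solve_lp` with the simplex method (reoptimizing with the dual simplex method after each new constraint) and would use a more sophisticated rule for choosing the branching variable.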


13.6 Recent Developments 


Integer programming has been an especially exciting area of operations research 
in very recent years because of the dramatic progress being made in its solution 
methodology. 

To place this progress into perspective, consider the historical background. One 
big breakthrough had come in the 1960s and early 1970s with the development and 
refinement of the branch-and-bound approach. But then the state of the art seemed to 
hit a plateau. Relatively small problems (well under a hundred variables) could be 
solved very efficiently, but even a modest increase in problem size might cause an 
explosion in computation time beyond feasible limits. Little progress was being made 
in overcoming this exponential growth in computation time as the problem size was 
increased. Many important problems arising in practice could not be solved. 

Then came the next breakthrough in the mid-1980s, as reported largely in three 
papers published in 1983, 1985, and 1987. (See Selected References 1, 4, and 13.) 
In the 1983 paper, Harlan Crowder, Ellis Johnson, and Manfred Padberg presented a 
new algorithmic approach to solving pure BIP problems that had successfully solved 
problems with no apparent special structure having up to 2,756 variables! This paper 
won the Lanchester Prize awarded by the Operations Research Society of America 
for the most notable publication in operations research during 1983. In the 1985 paper, 
Ellis Johnson, Michael Kostreva, and Uwe Suhl further refined this algorithmic 
approach. 

However, both of these papers were limited to pure BIP. For IP problems arising 
in practice, it is quite common for all the integer-restricted variables to be binary, but 
a large proportion of these problems are mixed BIP problems. What was critically 
needed was a way of extending this same kind of algorithmic approach to mixed BIP. 
This came in the 1987 paper by Tony Van Roy and Laurence Wolsey of Belgium. 
Once again, problems of very substantial size (up to nearly a thousand binary variables 
and a larger number of continuous variables) were being solved successfully. And 
once again, this paper won a very prestigious award, the Orchard-Hays Prize given 
triennially by the Mathematical Programming Society. 

We do need to add one note of caution. It is not yet clear just how consistently 
this algorithmic approach can successfully solve a wide variety of problems of this 
kind of very substantial size. The very large pure BIP problems solved had sparse A 
matrices; i.e., the percentage of coefficients in the functional constraints that were 
nonzero was quite small (perhaps less than 5 percent). In fact, the approach depends 
heavily upon this sparsity. (Fortunately, this kind of sparsity is typical in large practical 
problems.) Furthermore, there are other important factors besides sparsity and size 
that affect just how difficult a given IP problem will be to solve. It appears that IP 
formulations of fairly substantial size should still be approached with considerable 
caution. 

On the other hand, each new algorithmic breakthrough in operations research 
always generates a flurry of new research activity to try to develop and refine the new 
approach further. We will undoubtedly see some further fruits of intensified research 
activity in integer programming over the next decade. Perhaps through this research 
the gap in efficiency between integer programming and linear programming algorithms 
can be further closed. 

Although it would be beyond the scope and level of this book to describe the 
new algorithmic approach fully, we will now give a very brief overview. (You are 
encouraged to read Selected References 1, 4, and 13 for further information.) 

The approach uses a combination of three kinds of techniques: (1) automatic 
problem preprocessing, (2) the generation of cutting planes, and (3) clever branch- 
and-bound techniques. You already are familiar with branch-and-bound techniques, 
and we will not elaborate further on the more advanced versions incorporated here. 
A conceptual introduction to the other two kinds of techniques is given below. 

Automatic problem preprocessing involves a “computer inspection” of the user-supplied 
formulation of the IP problem in order to spot reformulations that make the 
problem quicker to solve without eliminating any feasible solutions. Some of the ideas 
are very simple, as we will now illustrate with pure binary variables. It may be 
possible to fix a variable at either 0 or 1 because of some constraint, e.g., 


    $3x_1 + 2x_2 \le 2 \;\Rightarrow\; x_1 = 0$ 

    $3x_1 - 2x_2 \le -1 \;\Rightarrow\; x_1 = 0$ and $x_2 = 1$, 


so the variable can then be deleted from the model (after substituting its fixed value). 
A functional constraint may be redundant because of the binary constraints, e.g., 


    $3x_1 + 2x_2 \le 6$, 


and so can be deleted. It may be possible to tighten a functional constraint by reducing 
some of its coefficients because of the binary constraints, e.g., 


    $4x_1 + 5x_2 + x_3 \ge 2 \;\Rightarrow\; 2x_1 + 2x_2 + x_3 \ge 2$. 


Doing so has the major advantage of tightening the LP-relaxation (eliminating some 
of its feasible solutions) without eliminating any feasible solutions for the BIP problem. 
Another technique is to prepare a “trigger” that will fix the value of one or 
more variables after another variable has been set at a certain value during the course 
of the algorithm. For example, for the group of variables representing a set of mutually 
exclusive alternatives, as soon as one of the variables is set equal to 1, the rest can 
be fixed at 0. Such a trigger can also be prepared for constraints representing contin- 
gent decisions and other constraints with a similar form. 
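For binary variables, claims of this kind are easy to verify by enumerating all 0-1 points. The Python sketch below is ours; the helper `feasible` and the three small constraints (one that fixes variables, one that is redundant, one that can be tightened) are illustrative and reflect our reading of the examples above.

```python
from fractions import Fraction as F
from itertools import product

def feasible(points, lhs, sense, rhs):
    """Set of points x satisfying sum(lhs[j] * x[j]) <sense> rhs."""
    ok = (lambda v: v <= rhs) if sense == "<=" else (lambda v: v >= rhs)
    return {x for x in points if ok(sum(a * v for a, v in zip(lhs, x)))}

B2 = list(product((0, 1), repeat=2))
B3 = list(product((0, 1), repeat=3))

# Fixing variables: 3x1 - 2x2 <= -1 leaves only x1 = 0, x2 = 1.
fixed = feasible(B2, (3, -2), "<=", -1)
print(fixed)                                          # {(0, 1)}

# Redundancy: 3x1 + 2x2 <= 6 excludes no binary point.
redundant = feasible(B2, (3, 2), "<=", 6) == set(B2)
print(redundant)                                      # True

# Tightening: the binary solutions are unchanged by reducing the coefficients ...
before = feasible(B3, (4, 5, 1), ">=", 2)
after = feasible(B3, (2, 2, 1), ">=", 2)
print(before == after)                                # True

# ... but the LP-relaxation loses fractional points such as (1/2, 0, 0).
p = [(F(1, 2), 0, 0)]
print(bool(feasible(p, (4, 5, 1), ">=", 2)), bool(feasible(p, (2, 2, 1), ">=", 2)))  # True False
```

Enumeration is exponential in the number of variables, so actual preprocessing routines rely on bound arguments applied to one constraint at a time rather than on a check like this one.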

Using the computer to implement these and similar ideas automatically has 
proven helpful in accelerating an algorithm. However, an even more important tech- 
nique is the generation of cutting planes. A cutting plane for an IP problem is a new 
constraint that eliminates some feasible solutions for the original LP-relaxation, in- 
cluding its optimal solution, without eliminating any feasible solutions for the IP 
problem. The purpose of adding new constraints with this property is to tighten the 
LP-relaxation, thereby tightening the bound obtained in the bounding step of the 
branch-and-bound technique. Tightening the LP-relaxation also increases the chance 
that its optimal solution will be feasible (and so optimal) for the IP problem, thereby 
improving fathoming test 3. 

To illustrate a cutting plane, consider the California Manufacturing Co. pure 
BIP problem presented in Sec. 13.1 and used to illustrate the BIP branch-and-bound 
algorithm in Sec. 13.4. The optimal solution for its LP-relaxation is given in Fig. 
13.3 as $(x_1, x_2, x_3, x_4) = (\tfrac{5}{6}, 1, 0, 1)$. One of the functional constraints is 

    $6x_1 + 3x_2 + 5x_3 + 2x_4 \le 10$. 

Now note that the binary constraints and this constraint together imply that 

    $x_1 + x_2 + x_4 \le 2$. 


This new constraint is a cutting plane. It “cuts off” the optimal solution for the 
LP-relaxation, $(\tfrac{5}{6}, 1, 0, 1)$, but it does not eliminate any feasible integer solutions. Adding 
just this one cutting plane to the original model would improve the performance of 
the BIP branch-and-bound algorithm in Sec. 13.4 (see Fig. 13.7) in two ways. First, 
the optimal solution for the new (tighter) LP-relaxation would be $(1, 1, \tfrac{1}{5}, 0)$, with 
$Z = 15\tfrac{1}{5}$, so the bounds for the All, $x_1 = 1$, and $x_2 = 1$ nodes now would be 15 
instead of 16. Second, one less iteration would be needed because the optimal solution 
for the LP-relaxation at the $x_3 = 0$ node now would be (1, 1, 0, 0), which provides 
a new incumbent with $Z^* = 14$. Therefore, on the third iteration (see Fig. 13.6), this 
node would be fathomed by test 3 and the $x_2 = 0$ node would be fathomed by test 
1, thereby revealing that this incumbent is the optimal solution for the original BIP 
problem. 
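Both defining properties of this cutting plane can be confirmed by a short enumeration. The Python sketch below is ours and looks only at the one functional constraint from which the cut is derived.

```python
from fractions import Fraction as F
from itertools import product

def knapsack_ok(x):
    """The functional constraint 6x1 + 3x2 + 5x3 + 2x4 <= 10."""
    return 6 * x[0] + 3 * x[1] + 5 * x[2] + 2 * x[3] <= 10

def cut_ok(x):
    """The proposed cutting plane x1 + x2 + x4 <= 2."""
    return x[0] + x[1] + x[3] <= 2

binary_points = [x for x in product((0, 1), repeat=4) if knapsack_ok(x)]
no_integer_point_lost = all(cut_ok(x) for x in binary_points)
print(no_integer_point_lost)                   # True

lp_opt = (F(5, 6), 1, 0, 1)
print(knapsack_ok(lp_opt), cut_ok(lp_opt))     # True False: the LP optimum is cut off
```

The reasoning behind the cut is visible in the coefficients: setting $x_1 = x_2 = x_4 = 1$ would use 6 + 3 + 2 = 11 units, which exceeds 10, so at most two of these three variables can equal 1.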

The new algorithmic approach presented in Selected References 1, 4, and 13 
involves generating many cutting planes in a similar manner before then applying 
clever branch-and-bound techniques. The results of including the cutting planes can 
be quite dramatic in tightening the LP-relaxations. For example, for the test problem 
with 2,756 binary variables considered in the 1983 paper, 326 cutting planes were 
generated. The result was that the gap between Z for the optimal solution for the 
LP-relaxation of the whole BIP problem and Z for this problem’s optimal solution was 
reduced by 98 percent. Similar results were obtained on about half of the problems. 

Ironically, the very first algorithms developed for integer programming, includ- 
ing Ralph Gomory’s celebrated algorithm announced in 1958, were based on cutting 
planes (generated in a different way), but this approach proved to be unsatisfactory 
in practice (except for special classes of problems). However, these algorithms relied 
solely on cutting planes. We now know that judiciously combining cutting planes and 
branch-and-bound techniques (along with automatic problem preprocessing) provides 
a powerful algorithmic approach for solving large-scale BIP problems. 


13.7 Conclusions 


IP problems arise frequently because some or all of the decision variables must be 
restricted to integer values. There also are many applications involving yes-or-no 
decisions (including combinatorial relationships expressible in terms of such decisions) 
that can be represented by binary (0-1) variables. These problems are more difficult 
than they would be without the integer restriction, so the algorithms available for 
integer programming are generally much less efficient than the simplex method. The 
most important determinants of computation time are the number of integer variables 
and the structure of the problem. For a fixed number of integer variables, BIP prob- 
lems generally are much easier to solve than problems with general integer variables, 
but adding continuous variables (MIP) may not increase computation time substan- 
tially. For special types of BIP problems containing a special structure that can be 
exploited by a special-purpose algorithm, it may be possible to solve very large 
problems (well over a thousand binary variables) routinely. Other much smaller prob- 
lems without such special structure may not be solvable. 

Computer codes for IP algorithms now are commonly available in mathematical 
programming software packages. These algorithms usually are based on the branch- 
and-bound technique and variations thereof. 

It appears that a new era in IP solution methodology has now been ushered in 
by a series of landmark papers in the mid-1980s. The new algorithmic approach 
involves combining automatic problem preprocessing, the generation of cutting planes, 
and clever branch-and-bound techniques. Research in this area is continuing. 

IP problems arising in practice sometimes are so large that they cannot be solved 
by even the latest algorithms. In these cases, it is common to simply apply the simplex 
method to the LP-relaxation and then round the optimal solution to a feasible integer 
solution. However, this approach is sometimes quite unsatisfactory because it may be 
difficult (or impossible) to find a feasible integer solution in this way, and the solution 
found may be far from optimal. This is especially true when dealing with binary 
variables or even general integer variables with small values. 
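The difficulty with rounding shows up even in the small MIP example of Sec. 13.5. The Python sketch below is ours, and the data are our transcription of that example. It rounds each integer-restricted variable of the LP-relaxation optimum down or up in every possible way and finds that none of the eight rounded points is feasible.

```python
from fractions import Fraction as F
from itertools import product
from math import ceil, floor

A = [[1, 0, 5, 0], [1, 1, -1, 0], [6, -5, 0, 0], [-1, 0, 2, -2]]
b = [10, 1, 0, 3]
lp_opt = [F(5, 4), F(3, 2), F(7, 4), F(0)]     # LP-relaxation optimum

def feasible(x):
    """Check nonnegativity and the four functional constraints."""
    return all(v >= 0 for v in x) and all(
        sum(a * v for a, v in zip(row, x)) <= r for row, r in zip(A, b))

# Round x1, x2, x3 down or up in all 2^3 ways, keeping x4 = 0.
roundings = list(product(*[(floor(v), ceil(v)) for v in lp_opt[:3]]))
survivors = [r for r in roundings if feasible(list(r) + [0])]
print(len(roundings), survivors)               # 8 []
```

Each rounded point violates one of the first three constraints, none of which involves the continuous variable, so no choice of its value can repair them. The optimal solution found by branch and bound, (0, 0, 2, 1/2), is not a rounding of the LP-relaxation optimum at all.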

To circumvent these difficulties with rounding, considerable progress has been 
made in developing efficient heuristic algorithms. Even with very large IP problems, 
these algorithms generally will quickly find very good feasible solutions that are not 
necessarily optimal but usually are better than those that can be found by simple 
rounding. 

In recent years, there has been considerable investigation into the development 
of algorithms for integer nonlinear programming, and this area continues to be a very 
active area of research. 


PROBLEMS 


1. A young couple, Eve and Steven, want to divide their main household chores (mar- 
keting, cooking, dishwashing, and laundering) between them so that each has two tasks but the 
total time they spend on household duties is kept to a minimum. Their efficiencies on these 
tasks differ, where the time each would need to perform the task is given by the following 
table: 

                  Hours Per Week Needed 
          Marketing   Cooking   Dishwashing   Laundering 
Eve          4.5        7.8         3.6           2.9 
Steven       4.9        7.2         4.3           3.1 


(a) Formulate a BIP model for this problem. 
(b) Use the BIP branch-and-bound algorithm presented in Sec. 13.4 to solve this prob- 
lem. (Hint: Reduce the model to four binary variables before solving.) 


2. The board of directors of the General Wheels Co. is considering seven large capital 
investments. These investments differ in the estimated long-run profit (net present value) they 
will generate, as well as in the amount of capital required, as shown by the following table (in 
units of millions of dollars): 


                        Investment Opportunity 
                    1    2    3    4    5    6    7 
Estimated profit   17   10   15   19    7   13    9 
Capital required   43   28   34   48   17   32   23 


The total amount of capital available for these investments is $100,000,000. Investment 
opportunities 1 and 2 are mutually exclusive, and so are 3 and 4. Furthermore, neither 3 nor 4 
can be undertaken unless either 1 or 2 is undertaken. There are no such restrictions on investment 
opportunities 5, 6, and 7. The objective is to select the combination of capital investments that 
will maximize the total estimated long-run profit (net present value). 
(a) Formulate a BIP model for this problem. 
(b) Use the BIP branch-and-bound algorithm presented in Sec. 13.4 to solve this prob- 
lem. (Hint: Reduce the problem to solving seven three-variable problems.) 


3. Reconsider Prob. 3, Chap. 7. Now suppose that trucks (and their drivers) need to be 
hired to do the hauling, and each truck can only be used to haul gravel from a single pit to a 
single site. In addition to the hauling and gravel costs specified previously, there now is a fixed 
cost of 5 associated with hiring each truck. A truck can haul 5 tons, but it is not required to 
go full. For each combination of pit and site, there now are two decisions to be made: (1) the 
number of trucks to be used, and (2) the amount of gravel to be hauled. Formulate an MIP 
model for this problem. 


4. Reconsider Prob. 30, Chap. 7. Formulate a BIP model for this problem. 


5. Consider the following special type of shortest path problem (see Sec. 10.3) where 
the nodes are in columns and the only routes considered always move forward one column at 
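a time. 


[Network diagram: nodes arranged in columns from the origin to the destination, with a distance shown along each link.] 
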

The numbers along the links represent distances, and the objective is to find the shortest route 
from the origin to the destination. 


This problem also can be formulated as a BIP model involving both mutually exclusive 
alternatives and contingent decisions. Formulate this model. 


6. Consider the following project network for a PERT-type system as described in Sec. 
10.8. 





Formulate a BIP model for the problem of finding a critical path for this project network. 
(Hint: Do not try to apply the definition of “critical path” literally. Instead, use a resulting 
property of a critical path to obtain the objective function.) 


7. A new planned community is being developed, and one of the decisions to be made 
is where to locate the two fire stations that have been allocated to the community. For planning 
purposes, the community has been divided into five tracts, with no more than one fire station 
to be located in any given tract. Each station is to respond to all of the fires that occur in the 
tract in which it is located as well as in the other tracts that are assigned to this station. Thus 
the decisions to be made consist of (1) the tracts to receive a fire station and (2) the assignment 
of each of the other tracts to one of the fire stations. The objective is to minimize the overall 
average of the response times to fires. 

The following table gives the average response time (in minutes) to a fire in each tract 
(the columns) if that tract is served by a station in a given tract (the rows). The bottom row 
gives the forecasted average number of fires that will occur in each of the tracts each day. 


                             Response Times 
                              Fire in Tract 
                           1     2     3     4     5 

Assigned station    1      5    12    30    20    15 
located in tract    2     20     4    15    10    25 
                    3     15    20     6    15    12 
                    4     25    15    25     4    10 
                    5     10    25    15    12     5 

Frequency of 
emergencies                2     1     3     1     3 





Formulate a complete BIP model for this problem. Identify any constraints that corre- 
spond to mutually exclusive alternatives or contingent decisions. 

8. Reconsider the Middletown racial balance study presented in Sec. 8.5. Suppose that 
the school board changes the current policy by prohibiting the splitting of a tract among different 
schools, so that an entire tract must be assigned to the same school. However, they will continue 
to require that the fraction of students in a school who are white (or who are black) must be 
between $ and 3. Formulate a BIP model for this problem under this new policy. 


9. Suppose that a state sends R persons to the U.S. House of Representatives. There 
are D counties in the state (D > R), and the state legislature wants to group these counties into 
R distinct electoral districts, each of which sends a delegate to Congress. The total population 
of the state is P, and the legislature wants to form districts whose population approximates 
p = P/R. Suppose that the appropriate legislative committee studying the electoral districting 
problem generates a long list of N candidates to be districts (N > R). Each of these candidates 
contains contiguous counties and a total population $p_j$ ($j = 1, 2, \ldots, N$) that is acceptably 
close to p. Define $c_j = |p_j - p|$. Each county i ($i = 1, 2, \ldots, D$) is included in at least one 
candidate and typically will be included in a considerable number of candidates (in order to 
provide many feasible ways of selecting a set of R candidates that includes each county exactly 
once). Define 

    $a_{ij} = 1$, if county i is included in candidate j 
    $a_{ij} = 0$, if not. 

Given the values of the $c_j$ and the $a_{ij}$, the objective is to select R of these N possible 
districts such that each county is contained in a single district and such that the largest of the 
associated $c_j$ is as small as possible. 

Formulate a BIP model for this problem. 


10.* An airline company is considering the purchase of new long-, medium-, and short- 
range jet passenger airplanes. The purchase price would be $33,500,000 for each long-range 
plane, $25,000,000 for each medium-range plane, and $17,500,000 for each short-range plane. 
The board of directors has authorized a maximum commitment of $750,000,000 for these 
purchases. Regardless of which airplanes are purchased, air travel of all distances is expected 
to be sufficiently large that these planes would be utilized at essentially maximum 
capacity. It is estimated that the net annual profit (after subtracting capital recovery costs) would 
be $2,100,000 per long-range plane, $1,500,000 per medium-range plane, and $1,150,000 per 
short-range plane. 

It is predicted that enough trained pilots will be available to the company to crew 30 
new airplanes. If only short-range planes were purchased, the maintenance facilities would be 
able to handle 40 new planes. However, each medium-range plane is equivalent to $1\tfrac{1}{3}$ 
short-range planes, and each long-range plane is equivalent to $1\tfrac{2}{3}$ short-range planes in terms of their 
use of the maintenance facilities. 

The information given here was obtained by a preliminary analysis of the problem. A 
more detailed analysis will be conducted subsequently. However, using the preceding data as 
a first approximation, management wishes to know how many planes of each type should be 
purchased to maximize profit. 

(a) Formulate the IP model for this problem. 
(b) Use the MIP branch-and-bound algorithm presented in Sec. 13.5 to solve this problem. 
(c) Use a binary representation of the variables to reformulate the IP model in part (a) 
as a BIP problem. 


11. An American professor will be spending a short sabbatical leave at the University 
of Iceland. She wishes to bring all needed items with her on the airplane. After collecting 
together the professional items that she must have, she finds that airline regulations on space 
and weight for checked luggage will severely limit the clothes she can take. (She plans to carry 
on a warm coat, and then purchase a warm Icelandic sweater upon arriving in Iceland.) Clothes 
under consideration for checked luggage include 3 skirts, 3 slacks, 4 tops, and 3 dresses. The 
professor wants to maximize the number of outfits she will have in Iceland (including the special 
dress she will wear on the airplane). Each dress constitutes an outfit. Other outfits consist of a 
combination of a top and either a skirt or slacks. However, certain combinations are not 
fashionable and so will not qualify as an outfit. 
In the following table, the combinations that will make an outfit are marked with an x. 


[Table: rows are the individual skirts and slacks; columns are tops 1, 2, 3, and 4 and the Icelandic sweater.] 


The weight (in grams) and volume (in cubic centimeters) of each item is shown in the following 
table: 























                   Weight    Volume 

Skirt     1          600      5,000 
          2          450      3,500 
          3          700      3,000 

Slacks    1          600      3,500 
          2          550      6,000 
          3          500      4,000 

Top       1          350      4,000 
          2          300      3,500 
          3          300      3,000 
          4          450      5,000 

Dress     1          600      6,000 
          2          700      5,000 
          3          800      4,000 

Total allowed      4,000     32,000 








Formulate a BIP model to choose which items of clothing to take. (Hint: After using 
binary decision variables to represent the individual items, introduce auxiliary binary variables 
to represent outfits involving combinations of items. Then use constraints and the objective 
function to ensure that these auxiliary variables have the correct values given the values of the 
decision variables.) 


12. The Research and Development Division of a company has been developing four 
possible new product lines. Management must now make a decision as to which of these four 
products actually will be produced and at what levels. Therefore, they have asked the Operations 
Research Department to formulate a mathematical programming model to find the most prof- 
itable product mix. 

A substantial cost is associated with beginning the production of any product, as given 
in the first row of the following table. The marginal net revenue from each unit produced is 
given in the second row of the table. 







                                  Product
                        1          2          3          4
Startup cost         $50,000    $40,000    $70,000    $60,000
Marginal revenue       $70        $60        $90        $80

Let the continuous decision variables $x_1$, $x_2$, $x_3$, and $x_4$ be the production levels of
products 1, 2, 3, and 4, respectively. Management has imposed the following policy constraints
on these variables:
1. No more than two of the products can be produced.
2. Either product 3 or 4 can be produced only if either product 1 or 2 is produced.
3. Either $5x_1 + 3x_2 + 6x_3 + 4x_4 \le 6000$
   or $4x_1 + 6x_2 + 3x_3 + 5x_4 \le 6000$.

Introduce auxiliary binary variables to formulate an MIP model for this problem. 
13. Consider the following mathematical model. 
Minimize $Z = f_1(x_1) + f_2(x_2)$,
subject to the restrictions

1. Either $x_1 \ge 3$ or $x_2 \ge 3$.
2. At least one of the following inequalities holds:
      $2x_1 + x_2 \ge 7$
      $x_1 + x_2 \ge 5$
      $x_1 + 2x_2 \ge 7$.
3. $|x_1 - x_2| = 0$, or 3, or 6.
4. $x_1 \ge 0$, $x_2 \ge 0$,

where $f_1(x_1) = 7 + 5x_1$ if $x_1 > 0$, and $f_1(x_1) = 0$ if $x_1 = 0$,
      $f_2(x_2) = 5 + 6x_2$ if $x_2 > 0$, and $f_2(x_2) = 0$ if $x_2 = 0$.

Formulate this problem as an MIP problem. 
14. Consider the following mathematical model. 
Maximize $Z = 3x_1 + 2f(x_2) + 2x_3 + 3g(x_4)$,
subject to the restrictions

1. $2x_1 - x_2 + x_3 + 3x_4 \le 15$.
2. At least one of the following two inequalities holds:
      $x_1 + x_2 + x_3 + x_4 \le 4$
      $3x_1 - x_2 - x_3 + x_4 \le 3$.
3. At least two of the following four inequalities hold:
      $5x_1 + 3x_2 + 3x_3 - x_4 \le 10$
      $2x_1 + 5x_2 - x_3 + 3x_4 \le 10$
      $-x_1 + 3x_2 + 5x_3 + 3x_4 \le 10$
      $3x_1 - x_2 + 3x_3 + 5x_4 \le 10$.
4. $x_3 = 1$, or 2, or 3.
5. $x_j \ge 0$ ($j = 1, 2, 3, 4$),

where $f(x_2) = -5 + 3x_2$ if $x_2 > 0$, and $f(x_2) = 0$ if $x_2 = 0$,
and   $g(x_4) = -3 + 5x_4$ if $x_4 > 0$, and $g(x_4) = 0$ if $x_4 = 0$.

Formulate this problem as an MIP problem. 


15. Consider the two-variable IP model discussed in Sec. 13.3 and illustrated in Fig. 13.1.
(a) Use a binary representation of the variables to reformulate this model as a BIP
problem.
(b) Use the BIP branch-and-bound algorithm presented in Sec. 13.4 to solve this problem.
(c) Use the MIP branch-and-bound algorithm presented in Sec. 13.5 to solve the original
model.
16. Consider the following IP problem. 
Maximize    $Z = 5x_1 + x_2$,
subject to  $-x_1 + 2x_2 \le 4$
            $x_1 - x_2 \le 1$
            $4x_1 + x_2 \le 12$
and         $x_1 \ge 0$, $x_2 \ge 0$
            $x_1$, $x_2$ are integers.
(a) Solve this problem graphically. 
(b) Solve the LP-relaxation graphically. Round this solution to the nearest integer so- 
lution and check whether it is feasible. Then enumerate all of the rounded solutions 
(rounding each noninteger value either up or down), check them for feasibility, and 
calculate Z for those that are feasible. Are any of these feasible rounded solutions 
optimal for the IP problem? 
(c) Use the MIP branch-and-bound algorithm presented in Sec. 13.5 to solve this prob- 
lem. For each subproblem, solve its LP-relaxation graphically. 
17. Consider the following IP problem. 
Maximize    $Z = 220x_1 + 80x_2$,
subject to  $-x_1 + 2x_2 \le 4$
            $5x_1 + 2x_2 \le 16$
            $2x_1 - x_2 \le 4$
and         $x_1 \ge 0$, $x_2 \ge 0$
            $x_1$, $x_2$ are integers.


(a) Solve this problem graphically. 

(b) Solve the LP-relaxation graphically. Round this solution to the nearest integer so- 
lution and check whether it is feasible. Then enumerate all of the rounded solutions 
(rounding each noninteger value either up or down), check them for feasibility, and 
calculate Z for those that are feasible. Are any of these feasible rounded solutions 
optimal for the IP problem? 

(c) Use the MIP branch-and-bound algorithm presented in Sec. 13.5 to solve this prob- 
lem. For each subproblem, solve its LP-relaxation graphically. 
18.* Consider the assignment problem with the following cost table:

                        Assignment
                  1     2     3     4     5
             A   39    65    69    66    57
             B   64    84    24    92     2
Assignee     C   49    50    61    31    45
             D   48    45    55    23    50
             E   59    34    30    34    18

(a) Design a branch-and-bound algorithm for solving such assignment problems by 
specifying how the branching, bounding, and fathoming steps would be performed. 
(Hint: For the assignees not yet assigned for the current subproblem, form the 
relaxation by deleting the constraints that each of these assignees must perform 
exactly one assignment.) 

(b) Use this algorithm to solve this problem. 


19. Five jobs need to be done on a certain machine. However, the setup time for each 
job depends upon which job immediately preceded it, as shown by the following table: 





                                   Setup Time
                                      Job
                               2     3     4     5
                     None      5     8     9     4
                     1         7    12    10     9
Immediately          2         -    10    14    11
preceding job        3        11     -    12    10
                     4         8    15     -     7
                     5         9     8    16     -





The objective is to schedule the sequence of jobs that minimizes the sum of the resulting setup 
times. 
(a) Design a branch-and-bound algorithm for sequencing problems of this type by 
specifying how the branching, bounding, and fathoming steps would be performed.
(b) Use this algorithm to solve this problem. 


20.* Consider the following BIP problem. 
Maximize    $Z = 80x_1 + 60x_2 + 40x_3 + 20x_4 - (7x_1 + 5x_2 + 3x_3 + 2x_4)^2$,
subject to  $x_j$ is binary, for $j = 1, 2, 3, 4$.

Given the value of the first $k$ variables $(x_1, \ldots, x_k)$, where $k = 0, 1, 2$, or 3, an upper
bound on the value of $Z$ that can be achieved by the corresponding feasible solutions is

$\sum_{j=1}^{k} c_j x_j - \left(\sum_{j=1}^{k} d_j x_j\right)^2 + \sum_{j=k+1}^{4} \max\left\{0,\; c_j - \left[\left(\sum_{i=1}^{k} d_i x_i + d_j\right)^2 - \left(\sum_{i=1}^{k} d_i x_i\right)^2\right]\right\}$,

where $c_1 = 80$, $c_2 = 60$, $c_3 = 40$, $c_4 = 20$, $d_1 = 7$, $d_2 = 5$, $d_3 = 3$, $d_4 = 2$. Use this
bound to solve the problem by the branch-and-bound technique.
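
The bound in Problem 20 can be sketched in code to see how it would be evaluated at a node. The following is a minimal illustration, not a full branch-and-bound procedure; the function names (`z`, `upper_bound`, `best_completion`) are our own, and the brute-force routine is included only to check that the bound never falls below the best completion of a partial solution.

```python
from itertools import product

C = [80, 60, 40, 20]  # objective coefficients c_j from the problem
D = [7, 5, 3, 2]      # coefficients d_j inside the squared term

def z(x):
    """Objective value Z for a complete 0-1 vector x."""
    linear = sum(c * xj for c, xj in zip(C, x))
    return linear - sum(d * xj for d, xj in zip(D, x)) ** 2

def upper_bound(fixed):
    """Bound on Z over all completions of the fixed prefix (x_1, ..., x_k)."""
    k = len(fixed)
    s = sum(D[j] * fixed[j] for j in range(k))
    bound = sum(C[j] * fixed[j] for j in range(k)) - s ** 2
    for j in range(k, len(C)):
        # Smallest possible penalty increase from setting x_j = 1.
        bound += max(0, C[j] - ((s + D[j]) ** 2 - s ** 2))
    return bound

def best_completion(fixed):
    """Brute-force best Z over all completions (for checking only)."""
    free = len(C) - len(fixed)
    return max(z(tuple(fixed) + rest) for rest in product((0, 1), repeat=free))
```

For instance, `upper_bound(())` gives the bound at the root node (no variables fixed), and `upper_bound((1,))` gives the bound for the branch with $x_1 = 1$.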








21. Use the BIP branch-and-bound algorithm presented in Sec. 13.4 to solve the following
problem.

Minimize    $Z = 5x_1 + 6x_2 + 7x_3 + 8x_4 + 9x_5$,
subject to  $3x_1 - x_2 + x_3 + x_4 - 2x_5 \ge 2$
            $x_1 + 3x_2 - x_3 - 2x_4 + x_5 \ge 0$
            $-x_1 - x_2 + 3x_3 + x_4 + x_5 \ge 1$
and         $x_j$ is binary, for $j = 1, 2, \ldots, 5$.


22.* Use the BIP branch-and-bound algorithm presented in Sec. 13.4 to solve the following
problem.

Maximize    $Z = 2x_1 - x_2 + 5x_3 - 3x_4 + 4x_5$,
subject to  $3x_1 - 2x_2 + 7x_3 - 5x_4 + 4x_5 \le 6$
            $x_1 - x_2 + 2x_3 - 4x_4 + 2x_5 \le 0$
and         $x_j$ is binary, for $j = 1, 2, \ldots, 5$.


23. Use the BIP branch-and-bound algorithm presented in Sec. 13.4 to solve the fol- 
lowing problem. 


Maximize    $Z = 5x_1 + 5x_2 + 8x_3 - 2x_4 - 4x_5$,
subject to  $-3x_1 + 6x_2 - 7x_3 + 9x_4 + 9x_5 \ge 10$
            $x_1 + 2x_2 - x_4 - 3x_5 \le 0$
and         $x_j$ is binary, for $j = 1, 2, \ldots, 5$.

24. Consider the following IP problem. 

Maximize    $Z = -3x_1 + 5x_2$,
subject to  $5x_1 - 7x_2 \ge 3$
and         $x_j \le 3$
            $x_j \ge 0$
            $x_j$ is integer, for $j = 1, 2$.


(a) Use the MIP branch-and-bound algorithm presented in Sec. 13.5 to solve this prob- 
lem. For each subproblem, solve its LP-relaxation graphically. 

(b) Use the binary representation for integer variables to reformulate this problem as a 
BIP problem. 

(c) Use the BIP branch-and-bound algorithm presented in Sec. 13.4 to solve the problem 
as formulated in part (b). 


25. Use the MIP branch-and-bound algorithm presented in Sec. 13.5 to solve the fol- 
lowing problem. (For each subproblem, solve its LP-relaxation graphically.) 


Minimize    $Z = 2x_1 + 3x_2$,
subject to  $x_1 + x_2 \ge 3$
            $x_1 + 3x_2 \ge 6$
and         $x_1 \ge 0$, $x_2 \ge 0$
            $x_1$, $x_2$ are integers.
26. Use the MIP branch-and-bound algorithm presented in Sec. 13.5 to solve the fol- 
lowing MIP problem. 
Maximize    $Z = 5x_1 + 4x_2 + 4x_3 + 2x_4$,
subject to  $x_1 + 3x_2 + 2x_3 + x_4 \le 10$
            $5x_1 + x_2 + 3x_3 + 2x_4 \le 15$
            $x_1 + x_2 + x_3 + x_4 \le 6$
and         $x_j \ge 0$, for $j = 1, 2, 3, 4$
            $x_j$ is integer, for $j = 1, 2, 3$.


27. Use the MIP branch-and-bound algorithm presented in Sec. 13.5 to solve the fol- 
lowing MIP problem. 


Maximize    $Z = 3x_1 + 4x_2 + 2x_3 + x_4 + 2x_5$,
subject to  $2x_1 - x_2 + x_3 + x_4 + x_5 \le 3$
            $-x_1 + 3x_2 + x_3 - x_4 - 2x_5 \le 2$
            $2x_1 + x_2 - x_3 + x_4 + 3x_5 \le 1$
and         $x_j \ge 0$, for $j = 1, 2, 3, 4, 5$
            $x_j$ is binary, for $j = 1, 2, 3$.


28. Use the MIP branch-and-bound algorithm presented in Sec. 13.5 to solve the fol- 
lowing MIP problem. 


Minimize    $Z = 5x_1 + x_2 + x_3 + 2x_4 + 3x_5$,
subject to  $x_2 - 5x_3 + x_4 + 2x_5 \ge -2$
            $5x_1 - x_2 + x_5 \ge 7$
            $x_1 + x_2 + 6x_3 + x_4 \ge 4$
and         $x_j \ge 0$, for $j = 1, 2, 3, 4, 5$
            $x_j$ is integer, for $j = 1, 2, 3$.


29. The optimal solution for the LP-relaxation of a certain four-variable pure BIP prob- 
lem is (x,, X2, x3, X4) = (3, 1, 3, 4). One of several functional constraints for this problem is 


$3x_1 + 5x_2 + 4x_3 + 8x_4 \le 10$.


Use this constraint to generate two valid cutting planes for the problem. 


14 


Nonlinear Programming 


The fundamental role of linear programming in operations research is accurately 
reflected by the fact that it is the focus of seven chapters of this book, and it is used 
in several other chapters. A key assumption of linear programming is that all its 
functions (objective function and constraint functions) are linear. Although this as- 
sumption essentially holds for numerous practical problems, it frequently does not 
hold. In fact, many economists have found that some degree of nonlinearity is the 
rule and not the exception in economic planning problems.¹ Therefore it often is
necessary to deal directly with nonlinear programming problems, so we turn our 
attention to this important area. 

In one general form,² the nonlinear programming problem is to find
$x = (x_1, x_2, \ldots, x_n)$ so as to

Maximize    $f(x)$,
subject to  $g_i(x) \le b_i$, for $i = 1, 2, \ldots, m$
and         $x \ge 0$,

where $f(x)$ and the $g_i(x)$ are given functions of the $n$ decision variables.¹


¹ For example, see Baumol, W. J., and R. C. Bushnell: "Error Produced by Linearization in Mathematical
Programming," Econometrica, 35:447-471, 1967.

² The other legitimate forms correspond to those for linear programming listed in Sec. 3.2. Section 4.6
describes how to convert these other forms into the form given here.

No algorithm that will solve every specific problem fitting this format is avail- 
able. However, substantial progress has been made for some important special cases 
of this problem by making various assumptions about these functions, and research 
is continuing very actively. This area is a large one, and we do not have space to 
survey it completely. However, we do present a few sample applications and then 
introduce some of the basic ideas for solving certain important types of nonlinear 
programming problems. 

Both Appendixes 1 and 2 provide useful background for this chapter, and we 
recommend that you review these appendixes as you study the next few sections. 


14.1 Sample Applications 


The following examples illustrate a few of the many important types of problems to
which nonlinear programming has been applied.


The Product Mix Problem with Price Elasticity 


In product mix problems, such as the Wyndor Glass Co. problem of Sec. 3.1, the 
goal is to determine the optimal mix of production levels for a firm’s products, given 
limitations on the resources needed to produce those products, in order to maximize 
the firm’s total profit. In some cases, there is a fixed unit profit associated with each 
of the products, so the resulting objective function will be linear. However, in many 
product mix problems, certain factors introduce nonlinearities into the objective func- 
tion. For example, a large manufacturer may encounter price elasticity, whereby the 
amount of a product that can be sold has an inverse relationship to the price charged. 
Thus the price-demand curve might look like the one shown in Fig. 14.1, where $p(x)$
is the price required in order to be able to sell $x$ units. If the unit cost for producing
the product is fixed at $c$ (see the dashed line in Fig. 14.1), the firm's profit for
producing and selling $x$ units is given by the nonlinear function

$P(x) = x\,p(x) - cx$,

as plotted in Fig. 14.2. If each of the firm's $n$ products has a similar profit function,
say $P_j(x_j)$ for producing and selling $x_j$ units of product $j$ ($j = 1, 2, \ldots, n$), then
the overall objective function is

$f(x) = \sum_{j=1}^{n} P_j(x_j)$,

a sum of nonlinear functions.

Figure 14.1 Price-demand curve.

¹ For simplicity, we assume throughout the chapter that all of these functions are differentiable everywhere.

Another reason that nonlinearities can arise in the objective function is due to 
the fact that the marginal cost of producing another unit of a given product varies 
with the production level. For example, the marginal cost may decrease when the 
production level is increased because of a learning curve effect (more efficient pro- 
duction with more experience). On the other hand, it may increase instead because 
special measures such as overtime or more expensive production facilities may be 
needed to increase production further. 

Nonlinearities also may arise in the $g_i(x)$ constraint functions in a similar fashion.
For example, if there is a budget constraint on total production cost, the cost
function will be nonlinear if the marginal cost of production varies as just described.
For constraints on the other kinds of resources, $g_i(x)$ will be nonlinear whenever the
use of the corresponding resource is not strictly proportional to the production levels
of the respective products.
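
To make the profit function $P(x) = x\,p(x) - cx$ from this subsection concrete, here is a minimal sketch. The linear price-demand curve $p(x) = a - bx$ and all numerical values are assumptions chosen purely for illustration; the text does not specify the shape of the curve in Fig. 14.1.

```python
def price(x, a=10.0, b=0.5):
    """Assumed linear price-demand curve p(x) = a - b*x (hypothetical)."""
    return a - b * x

def profit(x, c=2.0):
    """P(x) = x*p(x) - c*x for a fixed unit cost c."""
    return x * price(x) - c * x

def total_profit(levels, unit_costs):
    """f(x) = sum of P_j(x_j); every product shares the assumed price curve."""
    return sum(profit(x, c) for x, c in zip(levels, unit_costs))

if __name__ == "__main__":
    for x in (0, 4, 8, 12, 16):
        print(x, profit(x))
```

Under these assumed values, doubling the production level from 4 to 8 raises the profit from 24 to only 32, so the profit is not proportional to $x$, which is exactly what makes the objective function nonlinear.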


Figure 14.2 Profit function (vertical axis: profit; horizontal axis: amount).


The Transportation Problem with Volume Discounts on Shipping Costs


As illustrated by the P & T Company example in Sec. 7.1, a typical application of
the transportation problem is to determine an optimal plan for shipping goods from
various sources to various destinations, given supply and demand constraints, in order
to minimize total shipping cost. It was assumed in Chap. 7 that the cost per unit
shipped from a given source to a given destination is fixed, regardless of the amount
shipped. In actuality, this cost may not be fixed. Volume discounts sometimes are
available for large shipments, so that the marginal cost of shipping one more unit
might follow a pattern like the one shown in Fig. 14.3. The resulting cost of shipping
$x$ units then is given by a nonlinear function $C(x)$, which is a piecewise linear function
with slope equal to the marginal cost, like the one shown in Fig. 14.4. Consequently,
if each combination of source and destination has a similar shipping cost function, so
that the cost of shipping $x_{ij}$ units from source $i$ ($i = 1, 2, \ldots, m$) to destination $j$
($j = 1, 2, \ldots, n$) is given by a nonlinear function $C_{ij}(x_{ij})$, then the overall objective
function to be minimized is

$f(x) = \sum_{i=1}^{m} \sum_{j=1}^{n} C_{ij}(x_{ij})$.

Even with this nonlinear objective function, the constraints normally are still the
special linear constraints that fit the transportation problem model in Sec. 7.1.
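
The relationship between the marginal cost and the piecewise linear cost function $C(x)$ can be sketched as follows. The breakpoints and marginal costs are the values we read off Figs. 14.3 and 14.4 (the last marginal cost, 3, is inferred from the cost values 13.2 and 18.6), so treat them as illustrative; the function name `shipping_cost` is our own.

```python
# Breakpoints (amount shipped) and marginal costs as read from Figs. 14.3 and 14.4.
BREAKPOINTS = [0.6, 1.5, 2.7, 4.5]
MARGINAL_COSTS = [6.5, 5.0, 4.0, 3.0]

def shipping_cost(x):
    """Piecewise linear C(x): accumulate the marginal cost from 0 up to x."""
    if x < 0 or x > BREAKPOINTS[-1]:
        raise ValueError("x is outside the range covered by the figures")
    total, lower = 0.0, 0.0
    for upper, rate in zip(BREAKPOINTS, MARGINAL_COSTS):
        segment = min(x, upper) - lower  # portion of x falling in this interval
        if segment <= 0:
            break
        total += rate * segment
        lower = upper
    return total

if __name__ == "__main__":
    for x in BREAKPOINTS:
        print(x, round(shipping_cost(x), 2))
```

Because each successive marginal cost is smaller, the average cost per unit falls as the shipment grows, which is the volume discount that makes $C(x)$ nonlinear.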


Portfolio Selection with Risky Securities 


It now is common practice for professional managers of large stock portfolios to use 
computer models based partially on nonlinear programming to guide them. Because 
investors are concerned about both the expected return (gain) and the risk associated 
with their investments, nonlinear programming is used to determine a portfolio that, 
under certain assumptions, provides an optimal trade-off between these two factors. 

Figure 14.3 Marginal shipping cost (marginal cost levels 6.5, 5, 4, and 3 over amounts shipped
up to 0.6, 1.5, 2.7, and 4.5).
Figure 14.4 Shipping cost function (cost values 3.9, 8.4, 13.2, and 18.6 at amounts shipped
0.6, 1.5, 2.7, and 4.5).

A nonlinear programming model can be formulated for this problem as follows.
Suppose that $n$ stocks (securities) are being considered for inclusion in the portfolio,
and let the decision variables $x_j$ ($j = 1, 2, \ldots, n$) be the number of shares of stock
$j$ to be included. Let $\mu_j$ and $\sigma_{jj}$ be the (estimated) mean and variance of the return on
each share of stock $j$, where $\sigma_{jj}$ measures the risk of this stock. For $i = 1, 2, \ldots, n$
($i \ne j$), let $\sigma_{ij}$ be the covariance of the return on one share each of stock $i$ and stock
$j$. (Because it would be difficult to estimate all of the $\sigma_{ij}$, the usual approach is to
make certain assumptions about market behavior that enable us to calculate $\sigma_{ij}$ directly
from $\sigma_{ii}$ and $\sigma_{jj}$.) Then the expected value $R(x)$ and the variance $V(x)$ of the total
return from the entire portfolio are

$R(x) = \sum_{j=1}^{n} \mu_j x_j$

$V(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{ij} x_i x_j$,

where $V(x)$ measures the risk associated with the portfolio. The device used to consider
the trade-off between these two factors is to combine them together in the objective
function to be maximized,

$f(x) = R(x) - \beta V(x)$,

where the parameter $\beta$ is a nonnegative constant that reflects the investor's desired
trade-off between expected return and risk. Thus choosing $\beta = 0$ implies that risk
should be ignored completely, whereas choosing a large value for $\beta$ places a heavy
weight on minimizing risk [by maximizing the negative of $V(x)$].

The complete nonlinear programming model might be

Maximize    $f(x) = \sum_{j=1}^{n} \mu_j x_j - \beta \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{ij} x_i x_j$,
subject to  $\sum_{j=1}^{n} P_j x_j \le B$
and         $x_j \ge 0$, for $j = 1, 2, \ldots, n$,

where $P_j$ is the price for each share of stock $j$ and $B$ is the amount of money budgeted
for the portfolio. Under certain assumptions about the investor's utility function
(measuring the relative value to the investor of different total returns), it can be shown
that an optimal solution for this nonlinear programming problem maximizes the
investor's expected utility.¹

One drawback of the preceding formulation is that, because $R(x)$ and $V(x)$ are
somewhat incommensurable, it is relatively difficult to choose an appropriate value
for $\beta$. Therefore, rather than stopping with one choice of $\beta$, it is common to use a
parametric (nonlinear) programming approach to generate the optimal solution as a
function of $\beta$ over a wide range of values of $\beta$. The next step is to examine the values
of $R(x)$ and $V(x)$ for these solutions that are optimal for some value of $\beta$, and then
choose the solution that seems to give the best trade-off between these two quantities.
This procedure often is referred to as generating the solutions on the efficient frontier
of the two-dimensional graph of $(R(x), V(x))$ points for feasible $x$. The reason is that
the $(R(x), V(x))$ point for an optimal $x$ (for some $\beta$) lies on the frontier (boundary)
of the feasible points. Furthermore, each optimal $x$ is efficient in the sense that no
other feasible solution is at least equally good with one measure ($R$ or $V$) and strictly
better with the other measure (smaller $V$ or larger $R$).
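
The parametric idea described above can be illustrated with a small sketch. All data here (two stocks, their means, covariances, prices, and the budget) are hypothetical, and a brute-force search over whole-share portfolios stands in for a real parametric nonlinear programming algorithm; it is meant only to show how the optimal $(R(x), V(x))$ pair moves as $\beta$ changes.

```python
from itertools import product

MU = [5.0, 8.0]                       # assumed mean return per share
SIGMA = [[4.0, 1.0], [1.0, 16.0]]     # assumed covariance matrix of returns
PRICE = [20.0, 30.0]                  # assumed price per share
BUDGET = 200.0                        # assumed budget B

def expected_return(x):
    """R(x) = sum_j mu_j x_j."""
    return sum(m * xj for m, xj in zip(MU, x))

def variance(x):
    """V(x) = sum_i sum_j sigma_ij x_i x_j."""
    n = len(x)
    return sum(SIGMA[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def objective(x, beta):
    """f(x) = R(x) - beta * V(x)."""
    return expected_return(x) - beta * variance(x)

def feasible_portfolios():
    """Whole-share portfolios within budget (a coarse stand-in for continuous x)."""
    max_shares = [int(BUDGET // p) for p in PRICE]
    for x in product(*(range(m + 1) for m in max_shares)):
        if sum(p * xj for p, xj in zip(PRICE, x)) <= BUDGET:
            yield x

def best_portfolio(beta):
    return max(feasible_portfolios(), key=lambda x: objective(x, beta))

if __name__ == "__main__":
    for beta in (0.0, 0.05, 0.5, 5.0):
        x = best_portfolio(beta)
        print(beta, x, expected_return(x), variance(x))
```

As $\beta$ grows, the variance of the selected portfolio cannot increase, so the printed $(R, V)$ pairs trace out points along the efficient frontier for this toy data.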


¹ See Selected Reference 2, pp. 21-22, for further details.
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14.2 Graphical Illustration of Nonlinear 
Programming Problems 


When a nonlinear programming problem has just one or two variables, it can be 
represented graphically much like the Wyndor Glass Co. example for linear program- 
ming in Sec. 3.1. Because such a graphical representation gives considerable insight 
into the properties of optimal solutions for linear and nonlinear programming, let us 
look at a few examples. In order to highlight these differences, we shall use some 
nonlinear variations of the Wyndor Glass Co. problem. 

Figure 14.5 shows what happens to this problem if the only changes in the
model shown in Sec. 3.1 are that both the second and third functional constraints are
replaced by the single nonlinear constraint, $9x_1^2 + 5x_2^2 \le 216$. Compare Fig. 14.5
with Fig. 3.3. The optimal solution still happens to be $(x_1, x_2) = (2, 6)$. Furthermore,
it still lies on the boundary of the feasible region. However, it is not a corner-point
feasible solution. The optimal solution could have been a corner-point feasible solution
with a different objective function (check $Z = 3x_1 + x_2$), but the fact that it need
not be one means that we no longer have the tremendous simplification used in linear
programming of limiting the search for an optimal solution to just the corner-point
feasible solutions.

Figure 14.5 Wyndor Glass Co. example with a nonlinear constraint.
Figure 14.6 Wyndor Glass Co. example with a nonlinear objective function.

Now suppose that the linear constraints of Sec. 3.1 are kept unchanged, but the
objective function is made nonlinear. For example, if

$Z = 126x_1 - 9x_1^2 + 182x_2 - 13x_2^2$,

then the graphical representation in Fig. 14.6 indicates that the optimal solution is
$x_1 = 8/3$, $x_2 = 5$, which again lies on the boundary of the feasible region. (The value
of $Z$ for this optimal solution is $Z = 857$, so Fig. 14.6 depicts the fact that the locus
of all points with $Z = 857$ intersects the feasible region at just this one point, whereas
the locus of points with any larger $Z$ does not intersect the feasible region at all.) On
the other hand, if

$Z = 54x_1 - 9x_1^2 + 78x_2 - 13x_2^2$,

then the optimal solution turns out to be $(x_1, x_2) = (3, 3)$, which lies inside the
boundary of the feasible region. (You can check that this solution is optimal by using
calculus to derive it as the unconstrained global maximum; because it also satisfies
the constraints, it must be optimal for the constrained problem.) Therefore, a general
algorithm for solving similar problems needs to consider all solutions in the feasible
region, not just those on the boundary.
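
The two nonlinear objective functions above can be checked numerically. The sketch below assumes the Wyndor Glass Co. constraints of Sec. 3.1 are $x_1 \le 4$, $2x_2 \le 12$, and $3x_1 + 2x_2 \le 18$ with nonnegative variables, and uses a plain grid search (step 1/30, chosen so that $x_1 = 8/3$ lies on the grid) rather than any algorithm from this chapter. The function names are ours.

```python
def feasible(x1, x2, tol=1e-9):
    """Wyndor Glass Co. constraints as assumed from Sec. 3.1."""
    return (x1 >= -tol and x2 >= -tol and x1 <= 4 + tol
            and 2 * x2 <= 12 + tol and 3 * x1 + 2 * x2 <= 18 + tol)

def z_boundary(x1, x2):
    """First nonlinear objective; its optimum lies on the boundary."""
    return 126 * x1 - 9 * x1 ** 2 + 182 * x2 - 13 * x2 ** 2

def z_interior(x1, x2):
    """Second nonlinear objective; its optimum lies inside the region."""
    return 54 * x1 - 9 * x1 ** 2 + 78 * x2 - 13 * x2 ** 2

def grid_argmax(z, per_unit=30):
    """Brute-force search over feasible grid points with spacing 1/per_unit."""
    best = None
    for i in range(4 * per_unit + 1):
        for j in range(6 * per_unit + 1):
            x1, x2 = i / per_unit, j / per_unit
            if feasible(x1, x2):
                value = z(x1, x2)
                if best is None or value > best[0]:
                    best = (value, x1, x2)
    return best

if __name__ == "__main__":
    print(grid_argmax(z_boundary))
    print(grid_argmax(z_interior))
```

The first search lands on a point where $3x_1 + 2x_2 = 18$ holds with equality, while the second lands on a point where every constraint has slack, matching the distinction drawn in the text.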

Another complication that arises in nonlinear programming is that a local max- 
imum need not be a global maximum (the overall optimal solution). For example, 
consider the function of a single variable plotted in Fig. 14.7. Over the interval
$0 \le x \le 5$, this function has three local maxima ($x = 0$, $x = 2$, $x = 4$), but only
one of these, $x = 4$, is a global maximum. (Similarly, there are local minima at
$x = 1$, 3, and 5, but only $x = 5$ is a global minimum.)

Figure 14.7 A function with several local maxima.
Figure 14.8 Examples of (a) concave function, (b) convex function.

Nonlinear programming algorithms generally are unable to distinguish between
a local maximum and a global maximum (except by finding another better local
maximum). Therefore, it becomes crucial to know the conditions under which any
local maximum is guaranteed to be a global maximum over the feasible region. You
may recall from calculus that when we maximize an ordinary (doubly differentiable)
function of a single variable $f(x)$ without any constraints, this guarantee can be given
when

$\frac{d^2 f}{dx^2} \le 0$ for all $x$.

Such a function that is always "curving downward" (or not curving at all) is called
a concave function.¹ Similarly, if $\le$ is replaced by $\ge$, so that the function is always
"curving upward" (or not curving at all), it is called a convex function.² (Thus a
linear function is both concave and convex.) See Fig. 14.8 for examples. Then note
that Fig. 14.7 illustrates a function that is neither concave nor convex because it
alternates between curving upward and curving downward.

Functions of multiple variables also can be characterized as concave or convex 
if they always curve downward or curve upward. For example, consider a function 
consisting of a sum of terms. If each term is concave (we can check to see if it is 
from its second derivative when the term involves just one of the variables), then the 
function is concave. Similarly, the function is convex if each term is convex. These 
intuitive definitions are restated in precise terms, along with further elaboration on 
these concepts, in Appendix 1. 
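
As a brief worked check of the sum-of-terms test, consider the function $54x_1 - 9x_1^2 + 78x_2 - 13x_2^2$ used as an objective function earlier in this section. The grouping into two single-variable terms below is our own illustration.

```latex
f(x_1, x_2) = \underbrace{(54x_1 - 9x_1^2)}_{f_1(x_1)} + \underbrace{(78x_2 - 13x_2^2)}_{f_2(x_2)},
\qquad
\frac{d^2 f_1}{dx_1^2} = -18 \le 0,
\qquad
\frac{d^2 f_2}{dx_2^2} = -26 \le 0 .
```

Each term is concave, so the sum $f(x_1, x_2)$ is concave.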

If a nonlinear programming problem has no constraints, the objective function
being concave guarantees that a local maximum is a global maximum. (Similarly, the
objective function being convex ensures that a local minimum is a global minimum.)
If there are constraints, then one more condition will provide this guarantee, namely,
that the feasible region is a convex set. As discussed in Appendix 1, a convex set is
simply a set of points such that, for each pair of points in the collection, the entire
line segment joining these two points is also in the collection. Thus the feasible region
for the Wyndor Glass Co. problem in Fig. 3.3 (or the feasible region for any other
linear programming problem) is a convex set. Similarly, the feasible region in Fig.
14.5 is a convex set, which occurs whenever all of the $g_i(x)$ [for the constraints
$g_i(x) \le b_i$] are convex. Therefore, to guarantee that a local maximum is a global
maximum for a nonlinear programming problem with constraints $g_i(x) \le b_i$
($i = 1, 2, \ldots, m$) and $x \ge 0$, it is sufficient that the objective function $f(x)$ be
concave and each $g_i(x)$ be convex.


¹ Concave functions sometimes are referred to as concave downward.
² Convex functions sometimes are referred to as concave upward.


14.3 Types of Nonlinear Programming Problems 


Nonlinear programming problems come in many different shapes and forms. Unlike
the simplex method for linear programming, no single algorithm exists that will solve
all of these different types of problems. Instead, algorithms have been developed for
various individual classes (special types) of nonlinear programming problems. The
most important classes are introduced next, and then the subsequent sections describe
how some problems of these types can be solved.


Unconstrained Optimization 


Unconstrained optimization problems have no constraints, so the objective is simply

Maximize $f(x)$

over all values of $x = (x_1, x_2, \ldots, x_n)$. As reviewed in Appendix 2, the necessary
condition that a particular solution $x = x^*$ be optimal when $f(x)$ is a differentiable
function is

$\frac{\partial f}{\partial x_j} = 0$ at $x = x^*$, for $j = 1, 2, \ldots, n$.

When $f(x)$ is concave, this condition also is sufficient, so then solving for $x^*$ reduces
to solving the system of $n$ equations obtained by setting the $n$ partial derivatives equal
to zero. Unfortunately, for nonlinear functions $f(x)$, these equations often are going
to be nonlinear as well, in which case you are unlikely to be able to solve analytically
for their simultaneous solution. What then? Sections 14.4 and 14.5 describe
algorithmic search procedures for finding $x^*$, first for $n = 1$ and then for $n > 1$. These
procedures also play an important role in solving many of the problem types described
next, where there are constraints. The reason is that many algorithms for constrained
problems are designed so that they can focus on an unconstrained version of the
problem during a portion of each iteration.

When a variable $x_j$ does have a nonnegativity constraint, $x_j \ge 0$, the preceding
necessary and (perhaps) sufficient condition changes slightly to

$\frac{\partial f}{\partial x_j} \le 0$ at $x = x^*$, if $x_j^* = 0$,
$\frac{\partial f}{\partial x_j} = 0$ at $x = x^*$, if $x_j^* > 0$,

for each such $j$. A problem that has some such nonnegativity constraints but no
functional constraints is one special case ($m = 0$) of the next class of problems.
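
The condition for nonnegative variables can be checked numerically at a candidate point. The sketch below estimates the partial derivatives by central differences and tests each variable against the appropriate case; the example function `f`, the tolerance, and the helper names are hypothetical choices for illustration only.

```python
def gradient(f, x, h=1e-6):
    """Central-difference estimate of the partial derivatives of f at x."""
    grad = []
    for j in range(len(x)):
        up = list(x)
        up[j] += h
        down = list(x)
        down[j] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def satisfies_conditions(f, x, tol=1e-4):
    """Check df/dx_j <= 0 where x_j = 0 and df/dx_j = 0 where x_j > 0."""
    for xj, gj in zip(x, gradient(f, x)):
        if xj < 0:
            return False          # violates nonnegativity
        if xj == 0 and gj > tol:
            return False          # f could be increased by raising x_j
        if xj > 0 and abs(gj) > tol:
            return False          # partial derivative must vanish
    return True

def f(x):
    """Hypothetical concave function whose maximum over x >= 0 is at (3, 0)."""
    return 6 * x[0] - x[0] ** 2 - 2 * x[1] - x[1] ** 2

if __name__ == "__main__":
    for point in ([3.0, 0.0], [3.0, 1.0], [0.0, 0.0]):
        print(point, satisfies_conditions(f, point))
```

At $(3, 0)$ the partial derivative with respect to $x_2$ is negative rather than zero, which is allowed because $x_2$ sits at its lower bound; since this `f` is concave, the condition is also sufficient there.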
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Linearly Constrained Optimization 


Linearly constrained optimization problems are characterized by constraints that
completely fit linear programming, so that all of the $g_i(x)$ constraint functions are linear,
but the objective function $f(x)$ is nonlinear. The problem is considerably simplified
by having just one nonlinear function to take into account, along with a linear pro- 
gramming feasible region. A number of special algorithms based upon extending the 
simplex method to consider the nonlinear objective function have been developed. 
One important special case, which we consider next, is quadratic programming. 


Quadratic Programming 


Quadratic programming problems again have linear constraints, but now the objective 
function f(x) must be quadratic. Thus, the only difference between them and a linear 
programming problem is that some of the terms in the objective function involve the 
square of a variable or the product of two variables. 

Many algorithms have been developed for this case under the additional
assumption that $f(x)$ is concave. Section 14.7 presents an algorithm that involves a
direct extension of the simplex method.

Quadratic programming is very important, partially because such formulations
arise naturally in many applications. For example, the problem of portfolio selection
with risky securities described in Sec. 14.1 fits into this format. However, another
major reason for its importance is that a common approach to solving general linearly
constrained optimization problems is to solve a sequence of quadratic programming
approximations.


Convex Programming 


Convex programming covers a broad class of problems that actually encompasses as 
special cases all of the preceding types when f(x) is concave. The assumptions are: 


1. $f(x)$ is concave.
2. Each $g_i(x)$ is convex.


As discussed at the end of Sec. 14.2, these assumptions are enough to ensure that a 
local maximum is a global maximum. You will see in Sec. 14.6 that the necessary 
and sufficient conditions for such an optimal solution are a natural generalization of 
the conditions just given for unconstrained optimization and its extension to include 
nonnegativity constraints. Section 14.9 then describes algorithmic approaches to solv- 
ing convex programming problems. 


Separable Programming 


Separable programming is a special case of convex programming, where the one
additional assumption is:


3. All of the $f(x)$ and $g_i(x)$ functions are separable functions.


A separable function is simply a function where each term involves just a single
variable, so that the function is separable into a sum of functions of individual
variables. For example, if $f(x)$ is a separable function, it can be expressed as

$f(x) = \sum_{j=1}^{n} f_j(x_j)$,

where each $f_j(x_j)$ function includes only the terms involving just $x_j$. In the terminology
of linear programming (see Sec. 3.3), separable programming problems satisfy the
assumption of additivity but not the assumption of proportionality (for nonlinear
functions).

It is important to distinguish these problems from other convex programming 
problems, because any separable programming problem can be closely approximated 
by a linear programming problem so that the extremely efficient simplex method can 
be used. This approach is described in Sec. 14.8. (For simplicity, we focus there on 
the linearly constrained case where the special approach is needed only on the objec- 
tive function.) 


Nonconvex Programming 


Nonconvex programming encompasses all nonlinear programming problems that do 
not satisfy the assumptions of convex programming. Now, even if you are successful 
in finding a local maximum, there is no assurance that it also will be a global maximum. 
Therefore, there is no algorithm that will guarantee finding an optimal solution for all 
such problems. However, there do exist some algorithms that are relatively well suited 
for finding local maxima, especially when the forms of the nonlinear functions do not 
deviate too strongly from those assumed for convex programming. One such algorithm 
is presented in Sec. 14.10. 

However, certain specific types of nonconvex programming problems can be 
solved without great difficulty by special methods. Two especially important such 
types are discussed briefly next. 


Geometric Programming 


When we apply nonlinear programming to engineering design problems, the objective 
function and the constraint functions frequently take the form 


N 


g(x) = > GP), 


i= 


where PAX) = xix.. o. Xin, fori = 1,2,..., N. 


In such cases, the $c_i$ and $a_{ij}$ typically represent physical constants and the $x_j$ are design
variables. These functions generally are neither convex nor concave, so the techniques
of convex programming cannot be applied directly to these geometric programming
problems. However, there is one important case where the problem can be transformed
into an equivalent convex programming problem. This case is where all of the $c_i$
coefficients in each function are strictly positive, so that the functions are generalized
positive polynomials (now called posynomials), and the objective function is to be
minimized. The equivalent convex programming problem with decision variables
$y_1, y_2, \ldots, y_n$ is then obtained by setting

$$x_j = e^{y_j}, \quad \text{for } j = 1, 2, \ldots, n$$

throughout the original model, so now a convex programming algorithm can be applied.
Alternative solution procedures also have been developed for solving these
posynomial programming problems, as well as for geometric programming problems
of other types.¹
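The effect of the exponential substitution can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the coefficients `c` and exponents `a` are made-up values (not taken from any example in this book), and the function names are hypothetical. It confirms that a posynomial evaluated at $x_j = e^{y_j}$ equals a sum of exponentials of linear functions of y, and it spot-checks the midpoint inequality that any convex function must satisfy.

```python
import math

def posynomial(c, a, x):
    """g(x) = sum_i c_i * prod_j x_j ** a[i][j], assuming every c_i > 0 and x_j > 0."""
    return sum(ci * math.prod(xj ** aij for xj, aij in zip(x, ai))
               for ci, ai in zip(c, a))

def transformed(c, a, y):
    """The same function after substituting x_j = exp(y_j): sum_i c_i * exp(a_i . y)."""
    return sum(ci * math.exp(sum(aij * yj for aij, yj in zip(ai, y)))
               for ci, ai in zip(c, a))

# Hypothetical data: g(x) = 2 * x1^-1 * x2^0.5 + 3 * x1^2 * x2^-1.5
c = [2.0, 3.0]
a = [[-1.0, 0.5], [2.0, -1.5]]

y1, y2 = [0.3, -0.7], [-1.1, 0.4]
mid = [(u + v) / 2 for u, v in zip(y1, y2)]

# The two forms agree when x = exp(y).
same = math.isclose(posynomial(c, a, [math.exp(v) for v in y1]),
                    transformed(c, a, y1))

# Midpoint inequality in y (necessary for convexity) for this pair of points.
convex_at_mid = transformed(c, a, mid) <= (transformed(c, a, y1) + transformed(c, a, y2)) / 2
```

A single pair of points does not prove convexity, of course; the check merely illustrates the property that the transformation is designed to deliver.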


Fractional Programming 


Suppose that the objective function is in the form of a fraction, i.e., the ratio of two 
functions, 


$$\text{Maximize} \quad f(\mathbf{x}) = \frac{f_1(\mathbf{x})}{f_2(\mathbf{x})}.$$


Such fractional programming problems arise, for example, when maximizing the ratio 
of output to man-hours expended (productivity), or profit to capital expended (rate of 
return), or expected value to standard deviation of some measure of performance for 
an investment portfolio (return/risk). Some special solution procedures have been 
developed for certain forms of $f_1(\mathbf{x})$ and $f_2(\mathbf{x})$.²

When it can be done, the most straightforward approach to solving a fractional 
programming problem is to transform it into an equivalent problem of a standard type 
for which effective solution procedures already are available. To illustrate, suppose 
that f(x) is of the linear fractional programming form 





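$$\text{Maximize} \quad f(\mathbf{x}) = \frac{\mathbf{c}\mathbf{x} + c_0}{\mathbf{d}\mathbf{x} + d_0},$$

where c and d are row vectors, x is a column vector, and $c_0$ and $d_0$ are scalars. Also
assume that the constraint functions $g_i(\mathbf{x})$ are linear, so that the constraints in matrix
form are $\mathbf{A}\mathbf{x} \le \mathbf{b}$ and $\mathbf{x} \ge \mathbf{0}$.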

Under mild additional assumptions, we can transform the problem into an equiv- 
alent linear programming problem by letting 


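$$\mathbf{y} = \frac{\mathbf{x}}{\mathbf{d}\mathbf{x} + d_0} \quad \text{and} \quad t = \frac{1}{\mathbf{d}\mathbf{x} + d_0},$$

so that $\mathbf{x} = \mathbf{y}/t$. This result yields

$$\text{Maximize} \quad Z = \mathbf{c}\mathbf{y} + c_0 t,$$

subject to

$$\mathbf{A}\mathbf{y} - \mathbf{b}t \le \mathbf{0},$$
$$\mathbf{d}\mathbf{y} + d_0 t = 1,$$

and

$$\mathbf{y} \ge \mathbf{0}, \quad t \ge 0,$$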


¹ Duffin, Richard J., Elmor L. Peterson, and Clarence M. Zener: Geometric Programming, Wiley, New
York, 1967; Beightler, Charles, and Donald T. Phillips: Applied Geometric Programming, Wiley, New
York, 1976.


² The pioneering work on fractional programming was done by Charnes, A., and W. W. Cooper:
"Programming with Linear Fractional Functionals," Naval Research Logistics Quarterly, 9:181-186,
1962. Also see Schaible, Siegfried: "A Survey of Fractional Programming," in Schaible, Siegfried, and
William T. Ziemba (eds.), Generalized Concavity in Optimization and Economics, Academic Press, New
York, 1981, pp. 417-440.


which can be solved by the simplex method. More generally, the same kind of transformation
can be used to convert a fractional programming problem with concave
$f_1(\mathbf{x})$, convex $f_2(\mathbf{x})$, and convex $g_i(\mathbf{x})$ into an equivalent convex programming
problem.
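To see the change of variables at work, the short Python sketch below maps a point x to (y, t) and confirms that the linear objective cy + c₀t reproduces the original ratio and that dy + d₀t = 1. It is only an illustration: the numerical data and the function names are hypothetical, and the sketch assumes that the denominator dx + d₀ is strictly positive at the point being mapped.

```python
def to_lp_point(x, d, d0):
    """Map x (with d.x + d0 > 0) to (y, t), where t = 1/(d.x + d0) and y = t * x."""
    denom = sum(dj * xj for dj, xj in zip(d, x)) + d0
    if denom <= 0:
        raise ValueError("this sketch assumes d.x + d0 > 0")
    t = 1.0 / denom
    return [t * xj for xj in x], t

def ratio(x, c, c0, d, d0):
    """Original fractional objective (c.x + c0) / (d.x + d0)."""
    num = sum(cj * xj for cj, xj in zip(c, x)) + c0
    den = sum(dj * xj for dj, xj in zip(d, x)) + d0
    return num / den

# Hypothetical data, chosen only for illustration.
c, c0 = [2.0, 1.0], 1.0
d, d0 = [1.0, 3.0], 2.0
x = [1.5, 0.5]

y, t = to_lp_point(x, d, d0)
lp_objective = sum(cj * yj for cj, yj in zip(c, y)) + c0 * t    # Z = cy + c0*t
normalization = sum(dj * yj for dj, yj in zip(d, y)) + d0 * t   # dy + d0*t
```

Here `lp_objective` agrees with `ratio(x, c, c0, d, d0)` and `normalization` equals 1 up to rounding, while dividing `y` by `t` recovers `x`.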


The Complementarity Problem 


When we deal with quadratic programming in Sec. 14.7, you will see one example 
of how solving certain nonlinear programming problems can be reduced to solving 
the complementarity problem. Given variables $w_1, w_2, \ldots, w_p$ and $z_1, z_2, \ldots, z_p$,
the complementarity problem is to find a feasible solution for the set of constraints,

$$\mathbf{w} = F(\mathbf{z}), \quad \mathbf{w} \ge \mathbf{0}, \quad \mathbf{z} \ge \mathbf{0},$$

that also satisfies the complementarity constraint,

$$\mathbf{w}^T \mathbf{z} = 0.$$


Here, w and z are column vectors, F is a given vector-valued function, and the 
superscript T denotes transpose (see Appendix 3). The problem has no objective 
function, so technically it is not a full-fledged nonlinear programming problem. It is 
called the complementarity problem because of the complementary relationships that 
either 


$$w_i = 0 \quad \text{or} \quad z_i = 0 \quad \text{(or both)} \quad \text{for each } i = 1, 2, \ldots, p.$$

An important special case is the linear complementarity problem, where 
F(z) = q + Mz, 


where q is a given column vector and M is a given $p \times p$ matrix. Efficient algorithms
have been developed for solving this problem under suitable assumptions about the
properties of the matrix M.¹ One type involves pivoting from one basic feasible
solution to the next, much like the simplex method for linear programming. 

In addition to having applications in nonlinear programming, complementarity 
problems have applications in game theory, economic equilibrium problems, and 
engineering equilibrium problems. 
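The definition of the linear complementarity problem translates directly into a test of whether a candidate z is a solution. The Python sketch below is a checker only, not one of the pivoting algorithms mentioned above; the function names, the tolerance, and the 2 × 2 data are hypothetical choices made for illustration.

```python
def lcp_w(M, q, z):
    """Return w = q + Mz for a candidate vector z."""
    return [qi + sum(mij * zj for mij, zj in zip(row, z)) for qi, row in zip(q, M)]

def solves_lcp(M, q, z, tol=1e-9):
    """Check w >= 0, z >= 0, and w_i * z_i = 0 for every i (within a tolerance)."""
    w = lcp_w(M, q, z)
    nonnegative = all(wi >= -tol for wi in w) and all(zi >= -tol for zi in z)
    # With w >= 0 and z >= 0, w^T z = 0 holds exactly when every product w_i * z_i is 0.
    complementary = all(abs(wi * zi) <= tol for wi, zi in zip(w, z))
    return nonnegative and complementary

# Hypothetical instance.
M = [[2.0, 1.0], [1.0, 2.0]]
q = [-5.0, -6.0]

ok = solves_lcp(M, q, [4 / 3, 7 / 3])      # w = (0, 0), so every product w_i * z_i is 0
not_ok = solves_lcp(M, q, [0.0, 0.0])      # w = q has negative components
```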


14.4 One-Variable Unconstrained Optimization 


We now begin discussing how to solve some of the types of problems just described 
by considering the simplest case: unconstrained optimization with just a single variable
x (n = 1), where the differentiable function f(x) to be maximized is concave.²
Thus the necessary and sufficient condition for a particular solution x = x* to be
optimal (a global maximum) is

$$\frac{df}{dx} = 0 \quad \text{at } x = x^*,$$


¹ See Cottle, R. W., and G. B. Dantzig: "Complementary Pivot Theory of Mathematical Programming,"
Linear Algebra and Its Applications, 1:103-125, 1966, and Murty, K. G.: Linear and Combinatorial
Programming, Wiley, New York, 1976, chap. 16.


² See the beginning of Appendix 2 for a review of the corresponding case when f(x) is not concave.

Figure 14.9 The one-variable unconstrained programming problem when the function is concave.


as depicted in Fig. 14.9. If this equation can be solved directly for x*, you are done. 
However, if f(x) is not a particularly simple function, so the derivative is not just a
linear or quadratic function, you may not be able to solve the equation analytically. 
If not, the one-dimensional search procedure provides a straightforward way of solv- 
ing the problem numerically. 


The One-Dimensional Search Procedure 


Like other search procedures in nonlinear programming, the one-dimensional search 
procedure finds a sequence of trial solutions that leads toward an optimal solution.
At each iteration, you begin at the current trial solution to conduct a systematic search 
that culminates by identifying a new improved (hopefully substantially improved) trial 
solution. 

The idea behind the one-dimensional search procedure is a very intuitive one,
namely, that whether the slope (derivative) is positive or negative at a trial solution
definitely indicates whether this solution needs to be made larger or smaller to move
toward an optimal solution. Thus, if the derivative evaluated at a particular value of
x is positive, then x* must be larger than this x (see Fig. 14.9), so this x becomes a
lower bound on the trial solutions that need to be considered thereafter. Conversely,
if the derivative is negative, then x* must be smaller than this x, so x would become
an upper bound. Therefore, after both types of bounds have been identified, each new
trial solution selected between the current bounds provides a new tighter bound of
one type, thereby narrowing the search further. As long as a reasonable rule is used
to select each trial solution in this way, the resulting sequence of trial solutions must
converge to x*. In practice, this means continuing the sequence until the distance
between the bounds is sufficiently small that the next trial solution must be within a
prespecified error tolerance of x*.

This entire process is summarized next, using the notation

    $x'$ = current trial solution,
    $\underline{x}$ = current lower bound on x*,
    $\overline{x}$ = current upper bound on x*,
    $\epsilon$ = error tolerance for x*.


Although there are several reasonable rules for selecting each new trial solution, the 
one used in the following procedure is the midpoint rule (traditionally called the 
Bolzano search plan), which says simply to select the midpoint between the two 
current bounds. 


Summary of One-Dimensional Search Procedure 


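Initialization Step: Select $\epsilon$. Find an initial $\underline{x}$ and $\overline{x}$ by inspection (or by respectively
finding any value of x at which the derivative is positive and then negative). Select
an initial trial solution,

$$x' = \frac{\underline{x} + \overline{x}}{2}.$$

Iterative Step:
1. Evaluate $\dfrac{df(x)}{dx}$ at $x = x'$.
2. If $\dfrac{df(x)}{dx} \ge 0$, reset $\underline{x} = x'$.
3. If $\dfrac{df(x)}{dx} \le 0$, reset $\overline{x} = x'$.
4. Select a new $x' = \dfrac{\underline{x} + \overline{x}}{2}$.

Stopping Rule: If $\overline{x} - \underline{x} \le 2\epsilon$, so the new $x'$ must be within $\epsilon$ of x*, stop. Otherwise,
return to the iterative step.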
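The procedure just summarized can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `one_dimensional_search` and its argument names are our own (hypothetical), and the sketch assumes that the supplied initial bounds already bracket x*. It is applied here to the derivative of this section's example function, f(x) = 12x − 3x⁴ − 2x⁶, with initial bounds 0 and 2 and an error tolerance of 0.01.

```python
def one_dimensional_search(dfdx, lower, upper, eps):
    """Midpoint-rule search for the maximizer of a concave f, given its derivative.

    Assumes dfdx(lower) >= 0 >= dfdx(upper), so that x* lies between the bounds.
    Returns the final trial solution and the final bounds.
    """
    x = (lower + upper) / 2            # initial trial solution
    while upper - lower > 2 * eps:     # stopping rule: stop once the width is <= 2*eps
        slope = dfdx(x)
        if slope >= 0:
            lower = x                  # x* is at or above x
        if slope <= 0:
            upper = x                  # x* is at or below x
        x = (lower + upper) / 2        # new trial solution
    return x, lower, upper

# Derivative of the example function f(x) = 12x - 3x^4 - 2x^6.
x_final, lo, hi = one_dimensional_search(
    lambda x: 12 * (1 - x**3 - x**5), 0.0, 2.0, 0.01
)
```

For these inputs the sketch stops with a final trial solution of 0.8359375 and final bounds of 0.828125 and 0.84375. Note that only the sign of the derivative is used, never the value of f itself.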


We shall now illustrate this procedure by applying it to the following example. 


EXAMPLE: Suppose that the function to be maximized is

$$f(x) = 12x - 3x^4 - 2x^6,$$

as plotted in Fig. 14.10. Its first two derivatives are

$$\frac{df(x)}{dx} = 12(1 - x^3 - x^5),$$
$$\frac{d^2 f(x)}{dx^2} = -12(3x^2 + 5x^4).$$


Because the second derivative is nonpositive everywhere, f(x) is a concave function, 
so the one-dimensional search procedure can be applied safely to find its global max- 
imum. A quick inspection of this function (without even constructing its graph as 
shown in Fig. 14.10) indicates that f(x) is positive for small positive values of x, but 
it is negative for x < 0 or x > 2. Therefore, $\underline{x} = 0$ and $\overline{x} = 2$ can be used as the
initial bounds, with their midpoint, $x' = 1$, as the initial trial solution. Let 0.01 be
the error tolerance for x* in the stopping rule, so the final $(\overline{x} - \underline{x}) \le 0.02$ with the
final $x'$ at the midpoint. Applying the one-dimensional search procedure then yields

Figure 14.10 Example for one-dimensional search procedure.


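Table 14.1 Application of One-Dimensional Search Procedure to Example

Iteration | df(x)/dx | Lower bound | Upper bound | New x'    | f(x')
0         |          | 0           | 2           | 1         | 7.0000
1         | -12      | 0           | 1           | 0.5       | 5.7812
2         | +10.12   | 0.5         | 1           | 0.75      | 7.6948
3         | +4.09    | 0.75        | 1           | 0.875     | 7.8439
4         | -2.19    | 0.75        | 0.875       | 0.8125    | 7.8672
5         | +1.31    | 0.8125      | 0.875       | 0.84375   | 7.8829
6         | -0.34    | 0.8125      | 0.84375     | 0.828125  | 7.8815
7         | +0.51    | 0.828125    | 0.84375     | 0.8359375 | 7.8839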








the sequence of results shown in Table 14.1. [This table includes both the function 
and derivative values for your information, where the derivative is evaluated at the 
trial solution generated at the preceding iteration. However, note that the algorithm 
actually doesn’t need to calculate f(x’) at all and that it only needs to calculate the 
derivative far enough to determine its sign.] The conclusion is that 


$$x^* \approx 0.836,$$
$$0.828125 < x^* < 0.84375.$$


14.5 Multivariable Unconstrained Optimization 


Now consider the problem of maximizing a concave function f(x) of multiple variables,
$\mathbf{x} = (x_1, x_2, \ldots, x_n)$, when there are no constraints on the feasible values.
Suppose again that the necessary and sufficient condition for optimality, given by the 
system of equations obtained by setting the respective partial derivatives equal to zero 


(see Sec. 14.3), cannot be solved analytically, so that a numerical search procedure 
must be used. How can the preceding one-dimensional search procedure be extended 
to this multidimensional problem? 

In Sec. 14.4, the value of the ordinary derivative was used to select one of just 
two possible directions (increase x or decrease x) in which to move from the current 
trial solution to the next one. The goal was to reach a point eventually where this 
derivative is (essentially) zero. Now, there are innumerable possible directions in 
which to move; they correspond to the possible proportional rates at which the re- 
spective variables can be changed. The goal is to reach a point eventually where 
all of the partial derivatives are (essentially) zero. Therefore, extending the one- 
dimensional search procedure requires using the values of the partial derivatives to 
select the specific direction in which to move. This selection involves using the gra- 
dient of the objective function, as described next. 

Because the objective function f(x) is assumed to be differentiable, it possesses
a gradient denoted by $\nabla f(\mathbf{x})$ at each point x. In particular, the gradient at a specific
point $\mathbf{x} = \mathbf{x}'$ is the vector whose elements are the respective partial derivatives
evaluated at $\mathbf{x} = \mathbf{x}'$, so that

$$\nabla f(\mathbf{x}') = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right) \quad \text{at } \mathbf{x} = \mathbf{x}'.$$

The significance of the gradient is that the (infinitesimal) change in x that maximizes
the rate at which f(x) increases is the change that is proportional to $\nabla f(\mathbf{x})$. To express
this idea geometrically, the "direction" of the gradient, $\nabla f(\mathbf{x}')$, is interpreted as the
direction of the directed line segment (arrow) from the origin (0, 0, ..., 0) to the
point $(\partial f/\partial x_1, \partial f/\partial x_2, \ldots, \partial f/\partial x_n)$, where $\partial f/\partial x_j$ is evaluated at $x_j = x_j'$. Therefore,
it may be said that the rate at which f(x) increases is maximized if (infinitesimal)
changes in x are in the direction of the gradient $\nabla f(\mathbf{x})$. Because the objective is to
find the feasible solution maximizing f(x), it would seem expedient to attempt to move
in the direction of the gradient as much as possible.


The Gradient Search Procedure 


Because the current problem has no constraints, this interpretation of the gradient
suggests that an efficient search procedure should keep moving in the direction of the
gradient until it (essentially) reaches an optimal solution x*, where $\nabla f(\mathbf{x}^*) = \mathbf{0}$.
However, it normally would not be practical to change x continuously in the direction
of $\nabla f(\mathbf{x})$, because this series of changes would require continuously reevaluating the
$\partial f/\partial x_j$ and changing the direction of the path. Therefore, a better approach is to keep
moving in a fixed direction from the current trial solution, not stopping until f(x) stops
increasing. This stopping point would be the next trial solution, so the gradient then
would be recalculated to determine the new direction in which to move. With this
approach, each iteration involves changing the current trial solution x' as follows:

$$\text{Reset} \quad \mathbf{x}' = \mathbf{x}' + t^* \nabla f(\mathbf{x}'),$$

where t* is the positive value of t that maximizes $f(\mathbf{x}' + t\,\nabla f(\mathbf{x}'))$; that is,

$$f(\mathbf{x}' + t^* \nabla f(\mathbf{x}')) = \max_{t \ge 0} f(\mathbf{x}' + t\,\nabla f(\mathbf{x}')).$$

[Note that $f(\mathbf{x}' + t\,\nabla f(\mathbf{x}'))$ is simply f(x), where

$$x_j = x_j' + t \left( \frac{\partial f}{\partial x_j} \right)_{\mathbf{x} = \mathbf{x}'}, \quad \text{for } j = 1, 2, \ldots, n,$$

and that these expressions for the $x_j$ involve only constants and t, so f(x) becomes a
function of just the single variable t.] The iterations of this gradient search procedure
continue until $\nabla f(\mathbf{x}) = \mathbf{0}$ within a small tolerance $\epsilon$, that is, until

$$\left| \frac{\partial f}{\partial x_j} \right| \le \epsilon \quad \text{for all } j = 1, 2, \ldots, n.$$








An analogy may help to clarify this procedure. Suppose that you need to climb 
to the top of a hill. You are near-sighted, so you can’t see the top of the hill in order 
to walk directly in that direction. However, when you stand still, you can see the 
ground around your feet well enough to determine the direction in which the hill is 
sloping upward most sharply. You are able to walk in a straight line. While walking, 
you also are able to tell when you stop climbing (zero slope in your direction). 
Assuming that the hill is concave, you now can use the gradient search procedure 
for climbing to the top efficiently. This problem is a two-variable problem, where 
$(x_1, x_2)$ represents the coordinates (ignoring height) of your current location. The
function $f(x_1, x_2)$ gives the height of the hill at $(x_1, x_2)$. You start each iteration at
your current location (current trial solution) by determining the direction [in the $(x_1, x_2)$
coordinate system] in which the hill is sloping upward most sharply (the direction of
the gradient) at this point. You then begin walking in this fixed direction and continue
as long as you still are climbing. You eventually stop at a new trial location (solution)
when the hill becomes level in your direction, at which point you prepare to do another
iteration in another direction. You continue these iterations, following a zigzag path
up the hill, until you reach a trial location where the slope is essentially zero in all
directions. Under the assumption that the hill [$f(x_1, x_2)$] is concave, you must then be
essentially at the top of the hill. 

The most difficult part of the gradient search procedure usually is to find t*, the 
value of t that maximizes f in the direction of the gradient, at each iteration. Because
x' and $\nabla f(\mathbf{x}')$ have fixed values for the maximization, and because f(x) is concave,
this problem should be viewed as maximizing a concave function of a single variable
t. Therefore, it can be solved by the one-dimensional search procedure of Sec. 14.4
(where the initial lower bound on t must be nonnegative because of the $t \ge 0$ constraint).
Alternatively, if f is a simple function, it may be possible to obtain an
analytical solution by setting the derivative with respect to t equal to zero and solving.


Summary of Gradient Search Procedure 


Initialization Step: Select $\epsilon$ and any initial trial solution x'. Go first to the stopping
rule.

Iterative Step:
1. Express $f(\mathbf{x}' + t\,\nabla f(\mathbf{x}'))$ as a function of t by setting

$$x_j = x_j' + t \left( \frac{\partial f}{\partial x_j} \right)_{\mathbf{x} = \mathbf{x}'}, \quad \text{for } j = 1, 2, \ldots, n,$$

and then substituting these expressions into f(x).
2. Use the one-dimensional search procedure (or calculus) to find $t = t^*$ that maximizes
$f(\mathbf{x}' + t\,\nabla f(\mathbf{x}'))$ over $t \ge 0$.
3. Reset $\mathbf{x}' = \mathbf{x}' + t^* \nabla f(\mathbf{x}')$. Then go to the stopping rule.

Stopping Rule: Evaluate $\nabla f(\mathbf{x}')$ at $\mathbf{x} = \mathbf{x}'$. Check if

$$\left| \frac{\partial f}{\partial x_j} \right| \le \epsilon \quad \text{for all } j = 1, 2, \ldots, n.$$

If so, stop with the current x' as the desired approximation of an optimal solution x*.
Otherwise, go to the iterative step.
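The summary above can be sketched in Python as follows. This is one possible rendering under stated assumptions rather than a definitive implementation: the names `gradient_search`, `grad_f`, `line_tol`, and `max_iter` are hypothetical, t* is found by the midpoint rule of Sec. 14.4 applied to the derivative of f(x' + t∇f(x')) with respect to t, and the sketch assumes f is concave and bounded above along each search direction so that the upper bound on t can be found by doubling. The gradient used is that of this section's example, f(x) = 2x₁x₂ + 2x₂ − x₁² − 2x₂².

```python
def gradient_search(grad, x, eps, line_tol=1e-12, max_iter=10_000):
    """Gradient search for maximizing a concave f, given a function returning its gradient.

    Performs at most max_iter iterations; the stopping rule is checked before each one.
    """
    for _ in range(max_iter):
        g = grad(x)
        if all(abs(gj) <= eps for gj in g):           # stopping rule
            return x

        def slope(t):
            # d/dt f(x + t*g) = grad f(x + t*g) . g
            point = [xj + t * gj for xj, gj in zip(x, g)]
            return sum(dj * gj for dj, gj in zip(grad(point), g))

        lo, hi = 0.0, 1.0
        while slope(hi) > 0:                           # find an upper bound on t*
            hi *= 2
        while hi - lo > line_tol:                      # midpoint rule on t
            mid = (lo + hi) / 2
            if slope(mid) >= 0:
                lo = mid
            else:
                hi = mid
        t_star = (lo + hi) / 2
        x = [xj + t_star * gj for xj, gj in zip(x, g)]
    return x

def grad_f(x):
    """Gradient of f(x) = 2*x1*x2 + 2*x2 - x1^2 - 2*x2^2."""
    x1, x2 = x
    return [2 * x2 - 2 * x1, 2 * x1 + 2 - 4 * x2]

first = gradient_search(grad_f, [0.0, 0.0], eps=0.01, max_iter=1)   # one iteration only
x_approx = gradient_search(grad_f, [0.0, 0.0], eps=0.01)
```

Starting from (0, 0), a single iteration moves to approximately (0, 1/2), and the full run stops at a point whose partial derivatives are all within 0.01 of zero.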


Now let us illustrate this procedure. 


EXAMPLE: Consider the following two-variable problem.

$$\text{Maximize} \quad f(\mathbf{x}) = 2x_1x_2 + 2x_2 - x_1^2 - 2x_2^2.$$

Thus

$$\frac{\partial f}{\partial x_1} = 2x_2 - 2x_1,$$
$$\frac{\partial f}{\partial x_2} = 2x_1 + 2 - 4x_2.$$

We also can verify (see Appendix 1) that f(x) is concave. To begin the gradient search
procedure, suppose that x = (0, 0) is selected as the initial trial solution. Because
the respective partial derivatives are 0 and 2 at this point, the gradient is

$$\nabla f(0, 0) = (0, 2).$$

Therefore, to begin the first iteration, set

$$x_1 = 0 + t(0) = 0,$$
$$x_2 = 0 + t(2) = 2t,$$

and then substitute these expressions into f(x) to obtain

$$f(\mathbf{x}' + t\,\nabla f(\mathbf{x}')) = f(0, 2t) = 2(0)(2t) + 2(2t) - (0)^2 - 2(2t)^2 = 4t - 8t^2.$$

Because

$$f(0, 2t^*) = \max_{t \ge 0} f(0, 2t) = \max_{t \ge 0} \{4t - 8t^2\}$$

and

$$\frac{d}{dt}(4t - 8t^2) = 4 - 16t = 0,$$

it follows that

$$t^* = \tfrac{1}{4},$$

so

$$\text{Reset} \quad \mathbf{x}' = (0, 0) + \tfrac{1}{4}(0, 2) = \left(0, \tfrac{1}{2}\right).$$

For this new trial solution, the gradient is

$$\nabla f\left(0, \tfrac{1}{2}\right) = (1, 0).$$




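Thus for the second iteration, set

$$\mathbf{x} = \left(0, \tfrac{1}{2}\right) + t(1, 0) = \left(t, \tfrac{1}{2}\right),$$

so

$$f(\mathbf{x}' + t\,\nabla f(\mathbf{x}')) = f\left(0 + t, \tfrac{1}{2} + 0t\right) = f\left(t, \tfrac{1}{2}\right) = 2t\left(\tfrac{1}{2}\right) + 2\left(\tfrac{1}{2}\right) - t^2 - 2\left(\tfrac{1}{2}\right)^2 = t - t^2 + \tfrac{1}{2}.$$

Because

$$f\left(t^*, \tfrac{1}{2}\right) = \max_{t \ge 0} f\left(t, \tfrac{1}{2}\right) = \max_{t \ge 0} \left\{ t - t^2 + \tfrac{1}{2} \right\}$$

and

$$\frac{d}{dt}\left(t - t^2 + \tfrac{1}{2}\right) = 1 - 2t = 0,$$

then

$$t^* = \tfrac{1}{2},$$

so

$$\text{Reset} \quad \mathbf{x}' = \left(0, \tfrac{1}{2}\right) + \tfrac{1}{2}(1, 0) = \left(\tfrac{1}{2}, \tfrac{1}{2}\right).$$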


A nice way of organizing this work is to write out a table such as Table 14.2, 
which summarizes the preceding two iterations. At each iteration, the second column 
shows the current trial solution, and the last column shows the eventual new trial 
solution, which then is carried down into the second column for the next iteration. 
The fourth column gives the expressions for the $x_j$ in terms of t that need to be
substituted into f(x) to give the fifth column. 

Continuing in this fashion, the subsequent trial solutions would be $(\tfrac{1}{2}, \tfrac{3}{4})$, $(\tfrac{3}{4}, \tfrac{3}{4})$,
$(\tfrac{3}{4}, \tfrac{7}{8})$, $(\tfrac{7}{8}, \tfrac{7}{8})$, ..., as shown in Fig. 14.11. Because these points are converging to
x* = (1, 1), this solution is the optimal solution, as verified by the fact that

$$\nabla f(1, 1) = (0, 0).$$

However, because this converging sequence of trial solutions never reaches its limit,
the procedure actually will stop somewhere (depending on $\epsilon$) slightly below (1, 1) as
its final approximation of x*.

As Fig. 14.11 suggests, the gradient search procedure zigzags to the optimal 
solution rather than moving in a straight line. Some modifications of the procedure 
have been developed that accelerate movement toward the optimum by taking this 
zigzag behavior into account. 

If f(x) were not a concave function, the gradient search procedure still would
converge to a local maximum. The only change in the description of the procedure
for this case is that t* now would correspond to the first local maximum of
$f(\mathbf{x}' + t\,\nabla f(\mathbf{x}'))$ as t is increased from zero.

If the objective were to minimize f(x) instead, one change in the procedure 
would be to move in the opposite direction of the gradient at each iteration. In other 
words, the rule for obtaining the next point now would be

$$\text{Reset} \quad \mathbf{x}' = \mathbf{x}' - t^* \nabla f(\mathbf{x}').$$


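Table 14.2 Application of Gradient Search Procedure to Example

Iteration | $\mathbf{x}'$ | $\nabla f(\mathbf{x}')$ | $\mathbf{x}' + t\,\nabla f(\mathbf{x}')$ | $f(\mathbf{x}' + t\,\nabla f(\mathbf{x}'))$ | $t^*$ | $\mathbf{x}' + t^* \nabla f(\mathbf{x}')$
1 | $(0, 0)$ | $(0, 2)$ | $(0, 2t)$ | $4t - 8t^2$ | $\tfrac{1}{4}$ | $(0, \tfrac{1}{2})$
2 | $(0, \tfrac{1}{2})$ | $(1, 0)$ | $(t, \tfrac{1}{2})$ | $t - t^2 + \tfrac{1}{2}$ | $\tfrac{1}{2}$ | $(\tfrac{1}{2}, \tfrac{1}{2})$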


Figure 14.11 Illustration of the gradient search procedure. 


The only other change is that t* now would be the nonnegative value of t that minimizes
$f(\mathbf{x}' - t\,\nabla f(\mathbf{x}'))$; that is,

$$f(\mathbf{x}' - t^* \nabla f(\mathbf{x}')) = \min_{t \ge 0} f(\mathbf{x}' - t\,\nabla f(\mathbf{x}')).$$


14.6 The Karush-Kuhn-Tucker (KKT) Conditions 
for Constrained Optimization 


We now focus on the question of how to recognize an optimal solution for a nonlinear 
programming problem (with differentiable functions). What are the necessary and (per- 
haps) sufficient conditions that such a solution must satisfy? 

In the preceding sections we already have noted these conditions for unconstrained
optimization, as summarized in the first two lines of Table 14.3. Early in
Sec. 14.3 we also gave these conditions for the slight extension of unconstrained
optimization where the only constraints are nonnegativity constraints. These conditions
are shown in the third line of Table 14.3 in another equivalent form that is suggestive
of their generalization for general constrained optimization. As indicated in the last
line of the table, the conditions for the general case are called the Karush-Kuhn-Tucker
conditions (or KKT conditions), because they were derived independently
by Karush¹ and by Kuhn and Tucker.² Their basic result is embodied in the following
theorem.


Table 14.3 Necessary and Sufficient Conditions for Optimality

Problem | Necessary Conditions for Optimality | Also Sufficient if
One-variable unconstrained | $\dfrac{df}{dx} = 0$ | f(x) concave
Multivariable unconstrained | $\dfrac{\partial f}{\partial x_j} = 0$ (j = 1, 2, ..., n) | f(x) concave
Constrained, nonnegativity constraints only | $\dfrac{\partial f}{\partial x_j} \le 0$, $x_j \dfrac{\partial f}{\partial x_j} = 0$ (j = 1, 2, ..., n) | f(x) concave
General constrained problem | Karush-Kuhn-Tucker conditions | f(x) concave and $g_i(\mathbf{x})$ convex (i = 1, 2, ..., m)


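THEOREM: Assume that $f(\mathbf{x}), g_1(\mathbf{x}), g_2(\mathbf{x}), \ldots, g_m(\mathbf{x})$ are differentiable functions
satisfying certain regularity conditions.³ Then

$$\mathbf{x}^* = (x_1^*, x_2^*, \ldots, x_n^*)$$

can be an optimal solution for the nonlinear programming problem only if there exist
m numbers, $u_1, u_2, \ldots, u_m$, such that all of the following conditions are satisfied:

1. $\dfrac{\partial f}{\partial x_j} - \sum_{i=1}^{m} u_i \dfrac{\partial g_i}{\partial x_j} \le 0$
2. $x_j^* \left( \dfrac{\partial f}{\partial x_j} - \sum_{i=1}^{m} u_i \dfrac{\partial g_i}{\partial x_j} \right) = 0$
   (conditions 1 and 2 at $\mathbf{x} = \mathbf{x}^*$, for j = 1, 2, ..., n).
3. $g_i(\mathbf{x}^*) - b_i \le 0$
4. $u_i [g_i(\mathbf{x}^*) - b_i] = 0$
   (conditions 3 and 4 for i = 1, 2, ..., m).
5. $x_j^* \ge 0$, for j = 1, 2, ..., n.
6. $u_i \ge 0$, for i = 1, 2, ..., m.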


In the preceding KKT conditions, the $u_i$ correspond to the dual variables of
linear programming (we expand on this correspondence at the end of the section), and
they have a comparable economic interpretation. (However, the $u_i$ actually arose in
the mathematical derivation as Lagrange multipliers.) Conditions 3 and 5 do nothing
more than ensure the feasibility of the solution. The other conditions eliminate most 
of the feasible solutions as possible candidates to be an optimal solution. However, 
it should be noted that satisfying these conditions does not guarantee that the solution 
is optimal. As summarized in the last column of Table 14.3, certain additional con- 
vexity assumptions are needed to obtain this guarantee. These assumptions are spelled 
out in the following extension of the theorem. 


COROLLARY: Assume that f(x) is a concave function and that $g_1(\mathbf{x}), g_2(\mathbf{x}), \ldots, g_m(\mathbf{x})$
are convex functions (i.e., this problem is a convex programming
problem), where all of these functions satisfy the regularity conditions. Then
$\mathbf{x}^* = (x_1^*, x_2^*, \ldots, x_n^*)$ is an optimal solution if and only if all the conditions of the theorem
are satisfied.


¹ Karush, W.: "Minima of Functions of Several Variables with Inequalities as Side Conditions," M.S.
Thesis, Department of Mathematics, University of Chicago, 1939.

² Kuhn, H. W., and A. W. Tucker: "Nonlinear Programming," in Jerzy Neyman (ed.), Proceedings of
the Second Berkeley Symposium, University of California Press, Berkeley, 1951, pp. 481-492.

³ Ibid., p. 483.


EXAMPLE: To illustrate the formulation and application of the KKT conditions, we 
consider the following two-variable nonlinear programming problem. 
$$\text{Maximize} \quad f(\mathbf{x}) = \ln(x_1 + 1) + x_2,$$

subject to

$$2x_1 + x_2 \le 3$$

and

$$x_1 \ge 0, \quad x_2 \ge 0,$$

where ln denotes natural logarithm. Thus m = 1 and $g_1(\mathbf{x}) = 2x_1 + x_2$, so $g_1(\mathbf{x})$ is
convex. Furthermore, it can be easily verified (see Appendix 1) that f(x) is concave.
Hence the corollary applies, so any solution that satisfies the KKT conditions will
definitely be an optimal solution. These conditions are:





1(a). $\dfrac{1}{x_1 + 1} - 2u_1 \le 0$.
2(a). $x_1 \left( \dfrac{1}{x_1 + 1} - 2u_1 \right) = 0$.
1(b). $1 - u_1 \le 0$.
2(b). $x_2 (1 - u_1) = 0$.
3. $2x_1 + x_2 \le 3$.
4. $u_1 (2x_1 + x_2 - 3) = 0$.
5. $x_1 \ge 0$, $x_2 \ge 0$.
6. $u_1 \ge 0$.


The steps in solving the KKT conditions for this particular example are outlined 
below. 


1. $u_1 \ge 1$, from condition 1(b).
   $x_1 \ge 0$, from condition 5.
2. Therefore, $\dfrac{1}{x_1 + 1} - 2u_1 < 0$.
3. Therefore, $x_1 = 0$, from condition 2(a).
4. $u_1 \ne 0$ implies that $2x_1 + x_2 - 3 = 0$, from condition 4.
5. Steps 3 and 4 imply that $x_2 = 3$.
6. $x_2 \ne 0$ implies that $u_1 = 1$, from condition 2(b).
7. No conditions are violated by $x_1 = 0$, $x_2 = 3$, $u_1 = 1$.


Therefore, there exists a number $u_1 = 1$ such that $x_1 = 0$, $x_2 = 3$, and $u_1 = 1$ satisfy
all the conditions. Consequently, x* = (0, 3) is an optimal solution for this problem.
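The eight conditions for this example are easy to check mechanically for any proposed values of x₁, x₂, and u₁. The Python sketch below does so; it is an illustration only, with a hypothetical function name and a small numerical tolerance of our own choosing, and it is hard-wired to this particular example rather than to general problems.

```python
def kkt_violations(x1, x2, u1, tol=1e-9):
    """Return the labels of the KKT conditions violated for the example
    maximize ln(x1 + 1) + x2 subject to 2*x1 + x2 <= 3, x1 >= 0, x2 >= 0."""
    d1 = 1.0 / (x1 + 1.0) - 2.0 * u1     # df/dx1 - u1 * dg1/dx1
    d2 = 1.0 - u1                         # df/dx2 - u1 * dg1/dx2
    slack = 2.0 * x1 + x2 - 3.0           # g1(x) - b1
    checks = {
        "1(a)": d1 <= tol,
        "2(a)": abs(x1 * d1) <= tol,
        "1(b)": d2 <= tol,
        "2(b)": abs(x2 * d2) <= tol,
        "3": slack <= tol,
        "4": abs(u1 * slack) <= tol,
        "5": x1 >= -tol and x2 >= -tol,
        "6": u1 >= -tol,
    }
    return [label for label, satisfied in checks.items() if not satisfied]

at_solution = kkt_violations(0.0, 3.0, 1.0)   # no violations at x = (0, 3), u1 = 1
at_origin = kkt_violations(0.0, 0.0, 0.0)     # conditions 1(a) and 1(b) fail
```

A check of this kind verifies a proposed solution but does not find one; the solution still has to come from reasoning such as the steps outlined above.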

The particular progression of steps needed to solve the KKT conditions will 
differ from one problem to the next. When the logic is not apparent, it is sometimes 
helpful to consider separately the different cases where each $x_j$ and $u_i$ is specified
to be either = 0 or > 0, and then trying each case until one leads to a solution. In
the example, there are eight such cases corresponding to the eight combinations of
$x_1 = 0$ versus $x_1 > 0$, $x_2 = 0$ versus $x_2 > 0$, and $u_1 = 0$ versus $u_1 > 0$. Each case
leads to a simpler statement and analysis of the conditions. To illustrate, consider first
the case shown next, where $x_1 = 0$, $x_2 = 0$, and $u_1 = 0$.

KKT Conditions for the Case ($x_1 = 0$, $x_2 = 0$, $u_1 = 0$)

1(a). $\dfrac{1}{0 + 1} - 0 \le 0$. Contradiction.
1(b). $1 - 0 \le 0$. Contradiction.
3. $0 + 0 \le 3$.
(All the other conditions are redundant.)


As listed below, the other three cases where $u_1 = 0$ also give immediate contradictions
in a similar way, so no solution is available.

Case ($x_1 = 0$, $x_2 > 0$, $u_1 = 0$): Contradicts conditions 1(a), 1(b), and 2(b).
Case ($x_1 > 0$, $x_2 = 0$, $u_1 = 0$): Contradicts conditions 1(a), 2(a), and 1(b).
Case ($x_1 > 0$, $x_2 > 0$, $u_1 = 0$): Contradicts conditions 1(a), 2(a), 1(b), and 2(b).


The case ($x_1 > 0$, $x_2 > 0$, $u_1 > 0$) enables deleting these nonzero multipliers from
conditions 2(a), 2(b), and (4), which then enables deleting conditions 1(a), 1(b), and 
(3) as redundant, as summarized below. 


KKT Conditions for the Case ($x_1 > 0$, $x_2 > 0$, $u_1 > 0$)

2(a). $\dfrac{1}{x_1 + 1} - 2u_1 = 0$.
2(b). $1 - u_1 = 0$.
4. $2x_1 + x_2 - 3 = 0$.
(All the other conditions are redundant.)

Therefore, $u_1 = 1$, so $x_1 = -\tfrac{1}{2}$, which contradicts $x_1 > 0$.

Now suppose that the case ($x_1 = 0$, $x_2 > 0$, $u_1 > 0$) is tried next.


KKT Conditions for the Case ($x_1 = 0$, $x_2 > 0$, $u_1 > 0$)

1(a). $\dfrac{1}{0 + 1} - 2u_1 \le 0$.
2(b). $1 - u_1 = 0$.
4. $0 + x_2 - 3 = 0$.
(All the other conditions are redundant.)

Therefore, $x_1 = 0$, $x_2 = 3$, $u_1 = 1$.

Having found a solution, no additional cases need be considered. 

For problems more complicated than this example, it may be difficult, if not 
essentially impossible, to derive an optimal solution directly from the KKT conditions. 
Nevertheless, these conditions still provide valuable clues as to the identity of an 
optimal solution, and they also permit us to check whether a proposed solution may 
be optimal. 

There also are many valuable indirect applications of the KKT conditions. One 
of these applications arises in the duality theory that has been developed for nonlinear 
programming to parallel the duality theory for linear programming presented in Chap. 
6. In particular, for any given constrained maximization problem (call it the primal 
problem), the KKT conditions can be used to define a closely associated dual problem 
that is a constrained minimization problem. The variables in the dual problem consist 
of both the Lagrange multipliers, $u_i$ (i = 1, 2, ..., m), and the primal variables, $x_j$
(j = 1, 2, ..., n).¹ In the special case where the primal problem is a linear programming
problem, the $x_j$ variables drop out of the dual problem and it becomes the
familiar dual problem of linear programming (where the $u_i$ variables here correspond
to the $y_i$ variables in Chap. 6). When the primal problem is a convex programming
problem, it is possible to establish between the primal problem and the dual problem
relationships that are similar to those for linear programming. For example, the strong
duality property of Sec. 6.1, which states that the optimal objective function values
of the two problems are equal, also holds here. Furthermore, the values of the $u_i$
variables in an optimal solution for the dual problem can again be interpreted as
shadow prices (see Secs. 4.7 and 6.2); i.e., they give the rate at which the optimal
objective function value for the primal problem could be increased by (slightly) increasing
the right-hand side of the corresponding constraint. Because duality theory
for nonlinear programming is a relatively advanced topic, the interested reader is
referred elsewhere for further information.²

You will see another indirect application of the KKT conditions in the next
section.


14.7 Quadratic Programming 


As indicated in Sec. 14.3, the quadratic programming problem differs from the linear
programming problem only in that the objective function also includes $x_j^2$ and $x_i x_j$
(i ≠ j) terms. Thus, if we use matrix notation like that introduced at the beginning
of Sec. 5.2, the problem is to find x so as to

$$\text{Maximize} \quad f(\mathbf{x}) = \mathbf{c}\mathbf{x} - \tfrac{1}{2}\mathbf{x}^T\mathbf{Q}\mathbf{x},$$

subject to

$$\mathbf{A}\mathbf{x} \le \mathbf{b} \quad \text{and} \quad \mathbf{x} \ge \mathbf{0},$$


where c is a row vector, x and b are column vectors, Q and A are matrices, and the
superscript T denotes transpose (see Appendix 3). The $q_{ij}$ (elements of Q) are given
constants such that $q_{ij} = q_{ji}$ (which is the reason for the factor of ½ in the objective
function). By performing the indicated vector and matrix multiplications, the objective
function then is expressed in terms of these $q_{ij}$, the $c_j$ (elements of c), and the variables
as follows:

$$f(\mathbf{x}) = \mathbf{c}\mathbf{x} - \tfrac{1}{2}\mathbf{x}^T\mathbf{Q}\mathbf{x} = \sum_{j=1}^{n} c_j x_j - \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} q_{ij} x_i x_j.$$

If i = j in this double summation, then $x_i x_j = x_j^2$, so $-\tfrac{1}{2} q_{jj}$ is the coefficient of $x_j^2$.
If i ≠ j, then $-\tfrac{1}{2}(q_{ij} x_i x_j + q_{ji} x_j x_i) = -q_{ij} x_i x_j$, so $-q_{ij}$ is the total coefficient for
the product of $x_i$ and $x_j$.

To illustrate this notation, consider the following example of a quadratic pro- 
gramming problem. 


1 For details on this formulation, see Mangasarian, Olvi T.: Nonlinear Programming, McGraw-Hill, New 
York, 1969, chap 8. For a unified survey of various approaches to duality in nonlinear programming, see 
Geoffrion, A. M.: ‘‘Duality in Nonlinear Programming: A Simplified Applications-Oriented Development,”’ 
SIAM Review, 13:1-37, 1971. 


? Ibid. 
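Maximize $f(x_1, x_2) = 15x_1 + 30x_2 + 4x_1 x_2 - 2x_1^2 - 4x_2^2$, 


subject to $x_1 + 2x_2 \le 30$ 
and $x_1 \ge 0$, $x_2 \ge 0$. 

In this case, 


$$c = [15 \quad 30], \qquad x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \qquad Q = \begin{bmatrix} 4 & -4 \\ -4 & 8 \end{bmatrix},$$


$$A = [1 \quad 2], \qquad b = [30].$$

Note that 


$$-\tfrac{1}{2} x^T Q x = -\tfrac{1}{2} [x_1 \quad x_2] \begin{bmatrix} 4 & -4 \\ -4 & 8 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 4x_1 x_2 - 2x_1^2 - 4x_2^2.$$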
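As a quick check of the notation, the following sketch evaluates the objective of the example both from the matrix data (c and Q as given above) and from the written-out polynomial, and confirms that the two agree at a few sample points. The function names and the sample points are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

# Data of the section's example (c and Q as given in the text).
c = [15, 30]
Q = [[4, -4],
     [-4, 8]]

def f_matrix(x):
    """f(x) = c x - (1/2) x^T Q x, evaluated from the matrix data."""
    n = len(x)
    linear = sum(c[j] * x[j] for j in range(n))
    quad = sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return linear - Fraction(1, 2) * quad

def f_expanded(x):
    """The same objective written out term by term."""
    x1, x2 = x
    return 15 * x1 + 30 * x2 + 4 * x1 * x2 - 2 * x1 ** 2 - 4 * x2 ** 2

# Hypothetical sample points; any points would do.
for point in [(0, 0), (1, 2), (12, 9), (3, 7)]:
    assert f_matrix(point) == f_expanded(point)
```

Exact rational arithmetic (`fractions.Fraction`) is used only so that the factor of 1/2 introduces no rounding.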


Several algorithms have been developed for the special case of the quadratic 
programming problem where the objective function is a concave function. (A way to 
verify that the objective function is concave is to verify the equivalent condition that 


$$x^T Q x \ge 0$$


for all x, that is, Q is a positive semidefinite matrix.) We shall describe one¹ of these 
algorithms (the modified simplex method) that has been quite popular because it 
requires using only the simplex method with a slight modification. The key to this 
approach is to construct the KKT conditions from the preceding section, and then to 
reexpress these conditions in a convenient form that closely resembles linear 
programming. Therefore, before describing the algorithm, we shall develop this convenient 
form. 
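The concavity condition can be checked numerically for a given Q. The sketch below tests the stronger property of positive definiteness by Sylvester's criterion (all leading principal minors positive), which implies the positive semidefinite condition stated above; it is an illustration under that assumption, not the verification method of Appendix 1. The function names are hypothetical.

```python
from fractions import Fraction

def leading_minors(Q):
    """Determinants of the leading principal submatrices of Q (exact arithmetic)."""
    def det(M):
        M = [[Fraction(v) for v in row] for row in M]
        n, d = len(M), Fraction(1)
        for k in range(n):
            # Find a row with a nonzero entry in column k to use as the pivot.
            pivot = next((r for r in range(k, n) if M[r][k] != 0), None)
            if pivot is None:
                return Fraction(0)
            if pivot != k:
                M[k], M[pivot] = M[pivot], M[k]
                d = -d                      # a row swap flips the sign
            d *= M[k][k]
            for r in range(k + 1, n):
                factor = M[r][k] / M[k][k]
                M[r] = [a - factor * b for a, b in zip(M[r], M[k])]
        return d
    return [det([row[:k] for row in Q[:k]]) for k in range(1, len(Q) + 1)]

def is_positive_definite(Q):
    """Sylvester's criterion for a symmetric matrix: all leading minors > 0."""
    return all(m > 0 for m in leading_minors(Q))

Q = [[4, -4], [-4, 8]]                      # Q of the section's example
print(leading_minors(Q))
print(is_positive_definite(Q))
```

For the example, the leading minors are 4 and 4(8) - (-4)(-4) = 16, so Q is positive definite and the objective function is strictly concave.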


The KKT Conditions for Quadratic Programming 


For concreteness, let us first consider the above example. Starting with the form given 
in the preceding section, its KKT conditions are the following. 


1(a). $15 + 4x_2 - 4x_1 - u_1 \le 0$. 
2(a). $x_1(15 + 4x_2 - 4x_1 - u_1) = 0$. 
1(b). $30 + 4x_1 - 8x_2 - 2u_1 \le 0$. 
2(b). $x_2(30 + 4x_1 - 8x_2 - 2u_1) = 0$. 

3. $x_1 + 2x_2 - 30 \le 0$. 

4. $u_1(x_1 + 2x_2 - 30) = 0$. 

5. $x_1 \ge 0$, $x_2 \ge 0$. 

6. $u_1 \ge 0$. 


To begin reexpressing these conditions in a more convenient form, we move 
the constants in conditions 1(a), 1(b), and 3 to the right-hand side, and then introduce 
nonnegative slack variables (denoted by $y_1$, $y_2$, and $v_1$, respectively) to convert these 
inequalities to equations. 


1 Wolfe, Philip: "The Simplex Method for Quadratic Programming," Econometrica, 27:382-398, 1959. 
This paper develops both a short form and a long form of the algorithm. We present a version of the short 
form, which assumes further that either c = 0 or the objective function is strictly concave. 


1(a). $-4x_1 + 4x_2 - u_1 + y_1 = -15$ 
1(b). $4x_1 - 8x_2 - 2u_1 + y_2 = -30$ 
3. $x_1 + 2x_2 + v_1 = 30$ 


Having introduced $y_1$, note that condition 2(a) can now be reexpressed as simply 
requiring that either $x_1 = 0$ or $y_1 = 0$; that is, 


2(a). $x_1 y_1 = 0$. 
In just the same way, conditions 2(b) and 4 can be replaced by 
2(b). $x_2 y_2 = 0$, 
4. $u_1 v_1 = 0$. 
For each of these three pairs, ($x_1$, $y_1$), ($x_2$, $y_2$), ($u_1$, $v_1$), the two variables are called 
complementary variables, because only one of the two variables can be nonzero. 
Since all six variables are required to be nonnegative, these new forms of conditions 
2(a), 2(b), and 4 can be combined into one constraint, 
$$x_1 y_1 + x_2 y_2 + u_1 v_1 = 0,$$


called the complementarity constraint. 

After multiplying through the equations for conditions 1(a) and 1(b) by $-1$ 
to obtain nonnegative right-hand sides, we now have the desired convenient form for 
the entire set of conditions shown below. 


$4x_1 - 4x_2 + u_1 - y_1 = 15$ 
$-4x_1 + 8x_2 + 2u_1 - y_2 = 30$ 
$x_1 + 2x_2 + v_1 = 30$ 
$x_1 \ge 0$, $x_2 \ge 0$, $u_1 \ge 0$, $y_1 \ge 0$, $y_2 \ge 0$, $v_1 \ge 0$ 


$x_1 y_1 + x_2 y_2 + u_1 v_1 = 0$ 


This form is particularly convenient because, except for the complementarity 
constraint, these conditions are linear programming constraints. 

For any quadratic programming problem, its KKT conditions can be reduced to 
this same convenient form containing just linear programming constraints plus one 
complementarity constraint. Using matrix notation again, this general form is 


$Qx + A^T u - y = c^T$, 
$Ax + v = b$, 
$x \ge 0$, $u \ge 0$, $y \ge 0$, $v \ge 0$, 
$x^T y + u^T v = 0$, 


where the elements of the column vector u are the $u_i$ of the preceding section, and 
the elements of the column vectors y and v are slack variables. 

Because the objective function of the original problem is assumed to be concave 
and because the constraint functions are linear and therefore convex, the corollary to 
the theorem of Sec. 14.6 applies. Thus x is optimal if, and only if, there exist values 
of y, u, and v such that all four vectors together satisfy all these conditions. The 
original problem is thereby reduced to the equivalent problem of finding a feasible 
solution to these constraints. 

It is of interest to note that this equivalent problem is one example of the linear 
complementarity problem introduced in Sec. 14.3 (see Prob. 13), and that a key 
constraint for the linear complementarity problem is its complementarity constraint. 
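The general form above can be assembled mechanically from Q, A, c, and b. The following sketch (the function names and the column ordering x, u, y, v are assumptions made for illustration) builds the linear rows for the section's example so that they can be compared with the three equations of the convenient form derived earlier.

```python
def kkt_linear_rows(Q, A, c, b):
    """Rows [coefficients | rhs] of Qx + A^T u - y = c^T and Ax + v = b.

    Column order (an assumption of this sketch): x (n), u (m), y (n), v (m).
    """
    n, m = len(c), len(b)
    rows = []
    for j in range(n):
        x_part = list(Q[j])
        u_part = [A[i][j] for i in range(m)]          # row j of A^T
        y_part = [-1 if k == j else 0 for k in range(n)]
        v_part = [0] * m
        rows.append(x_part + u_part + y_part + v_part + [c[j]])
    for i in range(m):
        x_part = list(A[i])
        u_part = [0] * m
        y_part = [0] * n
        v_part = [1 if k == i else 0 for k in range(m)]
        rows.append(x_part + u_part + y_part + v_part + [b[i]])
    return rows

def complementarity(x, u, y, v):
    """Left-hand side of the complementarity constraint x^T y + u^T v."""
    return sum(a * w for a, w in zip(x, y)) + sum(a * w for a, w in zip(u, v))

Q = [[4, -4], [-4, 8]]
A = [[1, 2]]
c = [15, 30]
b = [30]
for row in kkt_linear_rows(Q, A, c, b):
    print(row)          # columns: x1, x2, u1, y1, y2, v1, right-hand side
```

Each printed row lists the coefficients of $x_1$, $x_2$, $u_1$, $y_1$, $y_2$, $v_1$ followed by the right-hand side, in the same order as the equations of the example.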


The Modified Simplex Method 


The modified simplex method exploits the key fact that, with the exception of the 
complementarity constraint, the KKT conditions in the convenient form obtained 
above are nothing more than linear programming constraints. Furthermore, the com- 
plementarity constraint simply implies that it is not permissible for both complemen- 
tary variables of any pair to be (nondegenerate) basic variables (the only variables 
> 0) when considering (nondegenerate) basic feasible solutions. Therefore, the prob- 
lem reduces to finding an initial basic feasible solution to any linear programming 
problem that has these constraints, subject to this additional restriction on the identity 
of the basic variables. (This initial basic feasible solution may be the only feasible 
solution in this case.) 

As we discussed in Sec. 4.6, finding such an initial basic feasible solution is 
relatively straightforward. In the simple case where $c^T \le 0$ (unlikely) and $b \ge 0$, the 
initial basic variables are the elements of y and v (multiply through the first set of 
equations by $-1$), so that the desired solution is $x = 0$, $u = 0$, $y = -c^T$, 
$v = b$. Otherwise, you need to revise the problem by introducing an artificial variable 
into each of the equations where $c_j > 0$ (add the variable on the left) or $b_i < 0$ 
(subtract the variable on the left and then multiply through by $-1$) in order to use 
these artificial variables (call them $z_1$, $z_2$, and so on) as initial basic variables for the 
revised problem. (Note that this choice of initial basic variables satisfies the 
complementarity constraint, because as nonbasic variables x = 0 and u = 0 automatically.) 

Next, use phase 1 of the two-phase method (see Sec. 4.6) to find a basic feasible 
solution for the real problem; i.e., apply the simplex method (with one modification) 
to the following linear programming problem. 


Minimize $Z = \sum_j z_j$, 


subject to the linear programming constraints obtained from the KKT conditions, but 
with these artificial variables included. 

The one modification in the simplex method is the following change in the 
procedure for selecting an entering basic variable. 


RESTRICTED ENTRY RULE: When choosing an entering basic variable, exclude 
from consideration any nonbasic variable whose complementary variable already is a 
basic variable; the choice should be made from among the other nonbasic variables 
according to the usual criterion for the simplex method. 


This rule keeps the complementarity constraint satisfied throughout the course 
of the algorithm. When an optimal solution 
$$x^*, \; u^*, \; y^*, \; v^*, \; z_1 = 0, \; \ldots, \; z_n = 0$$


is obtained for the phase 1 problem, $x^*$ is the desired optimal solution for the original 
quadratic programming problem. Phase 2 of the two-phase method is not needed. 
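The procedure just described can be sketched in a few lines of code. The version below is a minimal illustration, not a production implementation: it keeps a dense tableau in exact fractions, applies the restricted entry rule when choosing the entering basic variable, and makes no provision for degeneracy or cycling. The function name, the argument layout, and the variable indexing are assumptions of this sketch. The data are the convenient-form constraints of this section's example, with artificial variables $z_1$ and $z_2$ added to the first two equations.

```python
from fractions import Fraction

def modified_simplex(rows, basis, complement, artificial):
    """Phase 1 of the two-phase method with the restricted entry rule.

    rows       : constraint rows [coefficients..., rhs] with rhs >= 0
    basis      : index of the initial basic variable of each row
    complement : dict mapping a variable index to its complementary variable
    artificial : indices of the artificial variables (their sum is minimized)
    Returns a dict {basic variable index: value}.
    """
    T = [[Fraction(v) for v in row] for row in rows]
    basis = list(basis)
    ncols = len(T[0]) - 1
    # Row 0 for maximizing (-Z): +1 on each artificial column, then eliminate
    # the basic artificial variables by subtracting their rows.
    z = [Fraction(1) if j in artificial else Fraction(0) for j in range(ncols)]
    z.append(Fraction(0))
    for i, bv in enumerate(basis):
        if bv in artificial:
            z = [a - b for a, b in zip(z, T[i])]
    while True:
        # Restricted entry rule: skip a variable whose complement is basic.
        candidates = [j for j in range(ncols)
                      if z[j] < 0 and complement.get(j) not in basis]
        if not candidates:
            break
        enter = min(candidates, key=lambda j: z[j])   # most negative coefficient
        ratios = [(T[i][-1] / T[i][enter], i)
                  for i in range(len(T)) if T[i][enter] > 0]
        if not ratios:
            raise ValueError("no leaving variable; not expected in phase 1")
        _, leave = min(ratios)                        # minimum ratio test
        pivot = T[leave][enter]
        T[leave] = [v / pivot for v in T[leave]]
        for i in range(len(T)):
            if i != leave and T[i][enter] != 0:
                factor = T[i][enter]
                T[i] = [a - factor * b for a, b in zip(T[i], T[leave])]
        factor = z[enter]
        z = [a - factor * b for a, b in zip(z, T[leave])]
        basis[leave] = enter
    return {bv: T[i][-1] for i, bv in enumerate(basis)}

names = ["x1", "x2", "u1", "y1", "y2", "v1", "z1", "z2"]
rows = [[4, -4, 1, -1, 0, 0, 1, 0, 15],
        [-4, 8, 2, 0, -1, 0, 0, 1, 30],
        [1, 2, 0, 0, 0, 1, 0, 0, 30]]
complement = {0: 3, 3: 0, 1: 4, 4: 1, 2: 5, 5: 2}     # (x1,y1), (x2,y2), (u1,v1)
solution = modified_simplex(rows, basis=[6, 7, 5],
                            complement=complement, artificial={6, 7})
print({names[j]: str(value) for j, value in solution.items()})
```

Variables that do not appear in the returned dictionary are nonbasic and therefore zero. A caller should still confirm that no artificial variable remains basic at a positive value before accepting the x part of the result.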


EXAMPLE: We shall now illustrate this approach on the example given at the 
beginning of the section. As can be verified from the results in Appendix 1 [see Prob. 
43(a)], $f(x_1, x_2)$ is strictly concave; that is, 


$$Q = \begin{bmatrix} 4 & -4 \\ -4 & 8 \end{bmatrix}$$


is positive definite, so the algorithm can be applied. 

The starting point for solving this example is its KKT conditions in the con- 
venient form obtained earlier in the section. After introducing the needed artificial 
variables, the linear programming problem to be addressed explicitly by the modified 
simplex method then is 


Minimize $Z = z_1 + z_2$, 


subject to 
$4x_1 - 4x_2 + u_1 - y_1 + z_1 = 15$ 
$-4x_1 + 8x_2 + 2u_1 - y_2 + z_2 = 30$ 
$x_1 + 2x_2 + v_1 = 30$ 

and 


$x_1 \ge 0$, $x_2 \ge 0$, $u_1 \ge 0$, $y_1 \ge 0$, $y_2 \ge 0$, $v_1 \ge 0$, $z_1 \ge 0$, $z_2 \ge 0$. 
The additional complementarity constraint, 
$$x_1 y_1 + x_2 y_2 + u_1 v_1 = 0,$$


is not included explicitly, because the algorithm automatically enforces this constraint 
because of the restricted entry rule. In particular, for each of the three pairs of 
complementary variables, ($x_1$, $y_1$), ($x_2$, $y_2$), ($u_1$, $v_1$), whenever one of the two 
variables already is a basic variable, the other variable is excluded as a candidate to be 
the entering basic variable. Remember that the only nonzero variables are basic 
variables. Because the initial set of basic variables for the linear programming problem, 
$z_1$, $z_2$, $v_1$, gives an initial basic feasible solution that satisfies the complementarity 
constraint, there is no way that this constraint can be violated by any subsequent basic 
feasible solution. 

Table 14.4 shows the results of applying the modified simplex method to this 
problem. The first simplex tableau exhibits the initial system of equations after 
converting from minimizing Z to maximizing (−Z) and algebraically eliminating the 
initial basic variables from Eq. (0), just as was done for the radiation therapy example 
in Sec. 4.6. The three iterations proceed just as for the regular simplex method, except 
for eliminating certain candidates to be the entering basic variable because of the 
restricted entry rule. In the first tableau, $u_1$ is eliminated as a candidate because its 
complementary variable ($v_1$) already is a basic variable (but $x_2$ would have been chosen 
anyway because −4 < −3). In the second tableau, both $u_1$ and $y_2$ are eliminated as 
candidates (because $v_1$ and $x_2$ are basic variables), so $x_1$ automatically is chosen as 
the only candidate with a negative coefficient in row 0 (whereas the regular simplex 
method would have permitted choosing either $x_1$ or $u_1$ because they are tied for having 
the largest negative coefficient). In the third tableau, both $y_1$ and $y_2$ are eliminated 
(because $x_1$ and $x_2$ are basic variables). However, $u_1$ is not eliminated because $v_1$ 
no longer is a basic variable, so $u_1$ is chosen as the entering basic variable in the 
usual way. 
Table 14.4 Application of Modified Simplex Method to Quadratic Programming Example 

           Basic     Eq.                                                              Right 
Iteration  Variable  No.     x1     x2     u1     y1     y2     v1     z1     z2      Side 
    0      Z          0       0     -4     -3      1      1      0      0      0       -45 
           z1         1       4     -4      1     -1      0      0      1      0        15 
           z2         2      -4      8      2      0     -1      0      0      1        30 
           v1         3       1      2      0      0      0      1      0      0        30 
    1      Z          0      -2      0     -2      1    1/2      0      0    1/2       -30 
           z1         1       2      0      2     -1   -1/2      0      1    1/2        30 
           x2         2    -1/2      1    1/4      0   -1/8      0      0    1/8     3 3/4 
           v1         3       2      0   -1/2      0    1/4      1      0   -1/4    22 1/2 
    2      Z          0       0      0   -5/2      1    3/4      1      0    1/4    -7 1/2 
           z1         1       0      0    5/2     -1   -3/4     -1      1    3/4     7 1/2 
           x2         2       0      1    1/8      0  -1/16    1/4      0   1/16     9 3/8 
           x1         3       1      0   -1/4      0    1/8    1/2      0   -1/8    11 1/4 
    3      Z          0       0      0      0      0      0      0      1      1         0 
           u1         1       0      0      1   -2/5  -3/10   -2/5    2/5   3/10         3 
           x2         2       0      1      0   1/20  -1/40   3/10  -1/20   1/40         9 
           x1         3       1      0      0  -1/10   1/20    2/5   1/10  -1/20        12 

The resulting optimal solution for this phase 1 problem is $x_1 = 12$, $x_2 = 9$, 
$u_1 = 3$, with the rest of the variables zero. [Problem 43(c) asks you to verify that 
this solution is optimal by showing that $x_1 = 12$, $x_2 = 9$, $u_1 = 3$ satisfy the KKT 
conditions for the original problem when they are written in the form given in Sec. 
14.6.] Therefore, the optimal solution for the quadratic programming problem (which 
includes only the $x_1$ and $x_2$ variables) is $(x_1, x_2) = (12, 9)$. 
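The verification mentioned above is a matter of substituting the three values into the six KKT conditions listed earlier in this section. A small sketch of that substitution follows; the function name and the dictionary layout are illustrative assumptions.

```python
def kkt_checks(x1, x2, u1):
    """Evaluate the KKT conditions of the example at (x1, x2, u1)."""
    d1 = 15 + 4 * x2 - 4 * x1 - u1          # left side of condition 1(a)
    d2 = 30 + 4 * x1 - 8 * x2 - 2 * u1      # left side of condition 1(b)
    g = x1 + 2 * x2 - 30                    # left side of condition 3
    return {
        "1(a)": d1 <= 0,
        "2(a)": x1 * d1 == 0,
        "1(b)": d2 <= 0,
        "2(b)": x2 * d2 == 0,
        "3": g <= 0,
        "4": u1 * g == 0,
        "5": x1 >= 0 and x2 >= 0,
        "6": u1 >= 0,
    }

checks = kkt_checks(12, 9, 3)
print(checks)
print(all(checks.values()))
```

At (12, 9, 3) the left sides of conditions 1(a), 1(b), and 3 are all exactly zero, so every condition holds with equality or trivially.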


14.8 Separable Programming 





The preceding section showed how one class of nonlinear programming problems can 
be solved by an extension of the simplex method. We now consider another class, 
called separable programming, that actually can be solved by the simplex method 
itself, because any such problem can be approximated as closely as desired by a linear 
programming problem with a larger number of variables. 

As indicated in Sec. 14.3, separable programming assumes that the objective 
function f(x) is concave, that each of the constraint functions $g_i(x)$ is convex, and 
that all of these functions are separable functions (functions where each term involves 
just a single variable). However, in order to simplify the discussion, we focus here 
on the special case where the convex and separable $g_i(x)$ are, in fact, linear functions, 
just as for linear programming. Thus only the objective function requires special 
treatment. 

Under the preceding assumptions, the objective function can be expressed as a 
sum of concave functions of individual variables, 


$$f(x) = \sum_{j=1}^{n} f_j(x_j),$$


so that each $f_j(x_j)$ has a shape such as the one shown in Fig. 14.12 (either case) over 
the feasible range of values of $x_j$.¹ Because f(x) represents the measure of performance 


1 f(x) is concave if and only if every $f_j(x_j)$ is concave. 


Figure 14.12 Shape of profit curves for separable programming. Case 1: $f_j(x_j)$ is 
concave and piecewise linear. Case 2: $f_j(x_j)$ is just concave, shown with a piecewise 
linear approximation. In both cases the horizontal axis is the level of activity j ($x_j$), 
marked at 0, $u_{j1}$, $u_{j1} + u_{j2}$, and $u_{j1} + u_{j2} + u_{j3}$. 
(say profit) for all of the activities together, $f_j(x_j)$ represents the contribution to profit 
from activity j when it is conducted at the level $x_j$. The condition of f(x) being 
separable simply implies additivity (see Sec. 3.3); i.e., there are no interactions 
between the activities (no cross-product terms) that affect total profit beyond their 
independent contributions. The assumption that each $f_j(x_j)$ is concave says that the 
marginal profitability (slope of the profit curve) either stays the same or decreases 
(never increases) as $x_j$ is increased. 

Concave profit curves occur quite frequently. For example, it may be possible 
to sell a limited amount of some product at a certain price, and then a further amount 
at a lower price, and perhaps finally a further amount at a still lower price. Similarly, 
it may be necessary to purchase raw materials from increasingly expensive sources. 
Another common situation is where a more expensive production process must be 
used (e.g., overtime rather than regular-time work) to increase the production rate 
beyond a certain point. 

These kinds of situations can lead to either type of profit curve shown in Fig. 
14.12. In case 1, the slope decreases only at certain breakpoints, so that $f_j(x_j)$ is a 
piecewise linear function (a sequence of connected line segments). For case 2, the 
slope may decrease continuously as $x_j$ increases, so that $f_j(x_j)$ is a general concave 
function. Any such function can be approximated as closely as desired by a piecewise 
linear function, and this kind of approximation is used as needed for separable 
programming problems. (Figure 14.12 shows an approximating function that consists of 
just three line segments, but the approximation can be made even better just by 
introducing additional breakpoints.) This approximation is very convenient because a 
piecewise linear function of a single variable can be rewritten as a linear function of 
several variables, with one special restriction on the values of these variables, as 
described next. 
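To make the approximation idea concrete, the sketch below interpolates a concave curve linearly between breakpoints and reports the segment slopes and the largest sampled error for a coarse and a finer set of breakpoints. The profit curve $10\sqrt{x}$ and both breakpoint sets are hypothetical choices made only for illustration.

```python
import math

def piecewise_slopes(f, breakpoints):
    """Slopes of the line segments joining (b, f(b)) at consecutive breakpoints."""
    return [(f(b1) - f(b0)) / (b1 - b0)
            for b0, b1 in zip(breakpoints, breakpoints[1:])]

def piecewise_value(f, breakpoints, x):
    """Value at x of the piecewise linear interpolant of f."""
    slopes = piecewise_slopes(f, breakpoints)
    for b0, b1, s in zip(breakpoints, breakpoints[1:], slopes):
        if b0 <= x <= b1:
            return f(b0) + s * (x - b0)
    raise ValueError("x lies outside the breakpoints")

def profit(x):
    return 10 * math.sqrt(x)              # hypothetical concave profit curve

coarse = [0, 3, 6, 9]                     # three segments, as in Fig. 14.12
fine = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]     # additional breakpoints

print(piecewise_slopes(profit, coarse))   # slopes decrease segment by segment
for grid in (coarse, fine):
    worst = max(abs(profit(x / 10) - piecewise_value(profit, grid, x / 10))
                for x in range(0, 91))
    print(len(grid) - 1, "segments, largest sampled error:", worst)
```

Because the curve is concave, the slopes come out in decreasing order, which is the property the reformulation below relies on.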


Reformulation as a Linear Programming Problem 


The key to rewriting a piecewise linear function as a linear function is to use a separate 
variable for each line segment. To illustrate, consider the piecewise linear function 
$f_j(x_j)$ shown in Fig. 14.12, case 1 (or its approximation for case 2), which has three 
line segments over the feasible range of values of $x_j$. Introduce the three new 
variables $x_{j1}$, $x_{j2}$, $x_{j3}$ and set 

$$x_j = x_{j1} + x_{j2} + x_{j3},$$

where $0 \le x_{j1} \le u_{j1}$, $0 \le x_{j2} \le u_{j2}$, $0 \le x_{j3} \le u_{j3}$. 


Then use the slopes $s_{j1}$, $s_{j2}$, $s_{j3}$ to rewrite $f_j(x_j)$ as 
$$f_j(x_j) = s_{j1} x_{j1} + s_{j2} x_{j2} + s_{j3} x_{j3},$$
with the special restriction that 
$x_{j2} = 0$ whenever $x_{j1} < u_{j1}$, 
$x_{j3} = 0$ whenever $x_{j2} < u_{j2}$. 

To see why this special restriction is required, suppose that $x_j = 1$, where $u_{jk} > 1$ 
($k = 1, 2, 3$), so that $f_j(1) = s_{j1}$. Note that 


$$x_{j1} + x_{j2} + x_{j3} = 1$$


permits 
$x_{j1} = 1$, $x_{j2} = 0$, $x_{j3} = 0$, giving $f_j(1) = s_{j1}$, 
$x_{j1} = 0$, $x_{j2} = 1$, $x_{j3} = 0$, giving $f_j(1) = s_{j2}$, 
$x_{j1} = 0$, $x_{j2} = 0$, $x_{j3} = 1$, giving $f_j(1) = s_{j3}$, 


and so on, where $s_{j1} > s_{j2} > s_{j3}$. 


However, the special restriction permits only the first possibility, which is the only 
one giving the correct value for $f_j(1)$. 
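The three possibilities listed above can be reproduced with a few helper functions. In this sketch the slopes (5, 3, 1) and the segment bounds (2, 2, 2) are hypothetical numbers chosen to satisfy $s_{j1} > s_{j2} > s_{j3}$ and $u_{jk} > 1$; the function names are likewise assumptions.

```python
def segment_profit(slopes, amounts):
    """Linear expression s_j1*x_j1 + s_j2*x_j2 + ... for given segment amounts."""
    return sum(s * a for s, a in zip(slopes, amounts))

def satisfies_special_restriction(amounts, bounds):
    """x_jk may be positive only if the preceding segment is at its bound."""
    for k in range(1, len(amounts)):
        if amounts[k] > 0 and amounts[k - 1] < bounds[k - 1]:
            return False
    return True

def fill_in_order(x, bounds):
    """The split of x_j that respects the special restriction."""
    amounts = []
    for u in bounds:
        used = min(x, u)
        amounts.append(used)
        x -= used
    return amounts

slopes = [5, 3, 1]        # hypothetical slopes with s_j1 > s_j2 > s_j3
bounds = [2, 2, 2]        # hypothetical segment bounds u_jk > 1

for amounts in ([1, 0, 0], [0, 1, 0], [0, 0, 1]):
    print(amounts, segment_profit(slopes, amounts),
          satisfies_special_restriction(amounts, bounds))
print(fill_in_order(1, bounds))
```

Only the first split passes the restriction, and it is also the one with the largest value of the linear expression, which anticipates the key property discussed next.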

Unfortunately, the special restriction does not fit into the required format for 
linear programming constraints, so some piecewise linear functions cannot be rewritten 
in a linear programming format. However, our $f_j(x_j)$ are assumed to be concave, so 
$s_{j1} > s_{j2} > \cdots$, so that an algorithm for maximizing f(x) automatically gives the 
highest priority to using $x_{j1}$ when (in effect) increasing $x_j$ from zero, the next highest 
priority to using $x_{j2}$, and so on, without even including the special restriction explicitly 
in the model. This observation leads to the following key property. 


KEY PROPERTY OF SEPARABLE PROGRAMMING: When f(x) and the $g_i(x)$ satisfy 
the assumptions of separable programming, and when the resulting piecewise linear 
functions are rewritten as linear functions, deleting the special restriction gives a 
linear programming model whose optimal solution automatically satisfies the special 
restriction. 


We shall elaborate further on the logic behind this key property later in this 
section in the context of a specific example. [Also see Prob. 54(a).] 

To write down the complete linear programming model using the above notation, 
let $n_j$ be the number of line segments in $f_j(x_j)$ (or the piecewise linear function 
approximating it), so that 

$$x_j = \sum_{k=1}^{n_j} x_{jk}$$

would be substituted throughout the original model and 

$$f_j(x_j) = \sum_{k=1}^{n_j} s_{jk} x_{jk}$$

would be substituted into the objective function for $j = 1, 2, \ldots, n$.¹ The resulting 
model is 


Maximize $Z = \sum_{j=1}^{n} \left( \sum_{k=1}^{n_j} s_{jk} x_{jk} \right)$, 

subject to 

$\sum_{j=1}^{n} a_{ij} \left( \sum_{k=1}^{n_j} x_{jk} \right) \le b_i$, for $i = 1, 2, \ldots, m$, 
$x_{jk} \le u_{jk}$, for $k = 1, 2, \ldots, n_j$ and $j = 1, 2, \ldots, n$, 
and $x_{jk} \ge 0$, for $k = 1, 2, \ldots, n_j$ and $j = 1, 2, \ldots, n$. 


1 If one or more of the $f_j(x_j)$ already are linear functions, $f_j(x_j) = c_j x_j$, then $n_j = 1$, so neither of these 
substitutions would be made for this j. 
(The $\sum_{k=1}^{n_j} x_{jk} \ge 0$ constraints are deleted because they are ensured by the $x_{jk} \ge 0$ 
constraints.) If some original variable $x_j$ has no upper bound, then $u_{jn_j} = \infty$, so the 
constraint involving this quantity would be deleted. 

The most efficient way of solving this model is to use the streamlined version 
of the simplex method for dealing with upper bound constraints mentioned at the end 
of Sec. 7.5 (and described in Sec. 9.1). After obtaining an optimal solution for this 
model, you then would calculate 


$$x_j = \sum_{k=1}^{n_j} x_{jk},$$


for $j = 1, 2, \ldots, n$ in order to identify an optimal solution for the original separable 
programming problem (or its piecewise linear approximation). 


Example 


The Wyndor Glass Co. (see Sec. 3.1) has received a special order for handcrafted 
goods to be made in Plants 1 and 2 throughout the next 4 months. Filling this order 
will require borrowing certain employees from the work crews for the regular products, 
so the remaining workers would need to work overtime to utilize the full production 
capacity of the plant’s machinery and equipment for these regular products. In par- 
ticular, for the two new regular products discussed in Sec. 3.1, overtime would be 
required to utilize the last 25 percent of the production capacity available in Plant 1 
for product 1, and for the last 50 percent of the capacity available in Plant 2 for 
product 2. The additional cost of using overtime work would reduce the profit for 
each unit involved from $3 to $2 for product 1, and from $5 to $1 for product 2, 
giving the profit curves of Fig. 14.13, both of which fit the form for case 1 of Fig. 
14.12. 
Figure 14.13 Profit data during the next 4 months for the Wyndor Glass Co. (The 
horizontal axis of each panel is the rate of production.) 


Management has decided to go ahead and use overtime work rather than hire 
additional workers during this temporary situation. However, it does insist that the 
work crew for each product be fully utilized on regular time before any overtime is 
used. Furthermore, it feels that the current production rates ($x_1 = 2$ for product 1 and 
$x_2 = 6$ for product 2) should be changed temporarily if this would improve overall 
profitability. Therefore, it has instructed the OR Department to review products 1 and 
2 again to determine the most profitable product mix during the next 4 months. 


FORMULATION: At first glance it may appear straightforward to modify the Wyndor 
Glass Co. linear programming model in Sec. 3.1 to fit this new situation. In particular, 
let the production rate for product 1 be $x_1 = x_{1R} + x_{1O}$, where $x_{1R}$ is the production 
rate achieved on regular time and $x_{1O}$ is the incremental production rate from using 
overtime. Define $x_2 = x_{2R} + x_{2O}$ in the same way for product 2. Thus $n = 2$, $n_1 = 2$, 
and $n_2 = 2$ in the preceding general model. The new linear programming problem 
is to determine the values of $x_{1R}$, $x_{1O}$, $x_{2R}$, $x_{2O}$ so as to 


Maximize $Z = 3x_{1R} + 2x_{1O} + 5x_{2R} + x_{2O}$, 

subject to 
$x_{1R} \le 3$ 
$x_{1O} \le 1$ 
$2x_{2R} \le 6$ 
$2x_{2O} \le 6$ 
$3(x_{1R} + x_{1O}) + 2(x_{2R} + x_{2O}) \le 18$ 

and $x_{1R} \ge 0$, $x_{1O} \ge 0$, $x_{2R} \ge 0$, $x_{2O} \ge 0$. 


However, there is one important factor that is not taken into account explicitly 
in this formulation. Specifically, there is nothing in the model that requires all available 
regular time for a product to be fully utilized before any overtime is used for that 
product. In other words, it may be feasible to have $x_{1O} > 0$ even when $x_{1R} < 3$ and 
to have $x_{2O} > 0$ even when $x_{2R} < 3$. Such solutions would not, however, be acceptable 
to management. (Prohibiting such solutions is the special restriction discussed earlier 
in this section.) 

Now we come to the key property of separable programming. Even though the 
model does not take this factor into account explicitly, the model does take it into 
account implicitly! Despite the model having excess "feasible" solutions that actually 
are unacceptable, any optimal solution for the model is guaranteed to be a legitimate 
one that does not replace any available regular-time work with overtime work. (The 
reasoning here is analogous to that for the Big M method discussed in Sec. 4.6, where 
excess feasible but nonoptimal solutions also were allowed in the model as a matter 
of convenience.) Therefore, the simplex method can be safely applied to this model 
to find the most profitable acceptable product mix. The reason is twofold. First, the 
two decision variables for each product always appear together as a sum, $(x_{1R} + x_{1O})$ 
or $(x_{2R} + x_{2O})$, in each functional constraint (one in this case) other than the upper 
bound constraints on individual variables. Therefore, it always is possible to convert 
an unacceptable feasible solution to an acceptable one having the same total production 
rates, $x_1 = x_{1R} + x_{1O}$ and $x_2 = x_{2R} + x_{2O}$, merely by replacing overtime production 
by regular-time production as much as possible. Second, overtime production is less 
profitable than regular-time production (i.e., the slope of each profit curve in Fig. 
14.13 is a monotonically decreasing function of the rate of production), so converting 
an unacceptable feasible solution to an acceptable one in this way must increase the 
total rate of profit Z. Consequently, any feasible solution that uses overtime production 
for a product when regular-time production is still available cannot be optimal with 
respect to the model. 

For example, consider the unacceptable feasible solution $x_{1R} = 1$, $x_{1O} = 1$, 
$x_{2R} = 1$, $x_{2O} = 3$, which yields a total rate of profit Z = 13. The acceptable way of 
achieving the same total production rates $x_1 = 2$ and $x_2 = 4$ is $x_{1R} = 2$, $x_{1O} = 0$, 
$x_{2R} = 3$, $x_{2O} = 1$. This latter solution is still feasible, but it also increases Z by 
$(3 - 2)(1) + (5 - 1)(2) = 9$. 

Similarly, the optimal solution for this model turns out to be $x_{1R} = 3$, $x_{1O} = 1$, 
$x_{2R} = 3$, $x_{2O} = 0$, which is an acceptable feasible solution. 
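The numbers in this example can be reproduced with a short script. The sketch below computes the profit of the unacceptable solution, converts it to the acceptable one with the same totals, and then searches a half-unit grid of feasible solutions for the most profitable one. The grid search is only an illustrative shortcut that happens to contain the optimal corner point of this small model; it is not the simplex method, and all function names are assumptions.

```python
from itertools import product

SLOPES = {"x1R": 3, "x1O": 2, "x2R": 5, "x2O": 1}   # unit profits
UPPER = {"x1R": 3, "x1O": 1, "x2R": 3, "x2O": 3}    # from x1R<=3, x1O<=1, 2x2R<=6, 2x2O<=6

def profit(sol):
    return sum(SLOPES[name] * sol[name] for name in SLOPES)

def feasible(sol):
    within_bounds = all(0 <= sol[name] <= UPPER[name] for name in UPPER)
    plant3 = 3 * (sol["x1R"] + sol["x1O"]) + 2 * (sol["x2R"] + sol["x2O"]) <= 18
    return within_bounds and plant3

def acceptable(sol):
    """Management's rule: no overtime unless regular time is fully used."""
    return all(sol[p + "O"] == 0 or sol[p + "R"] == UPPER[p + "R"]
               for p in ("x1", "x2"))

def shift_to_regular_time(sol):
    """Same totals x1 and x2, but with regular time used first."""
    new = {}
    for p in ("x1", "x2"):
        total = sol[p + "R"] + sol[p + "O"]
        new[p + "R"] = min(total, UPPER[p + "R"])
        new[p + "O"] = total - new[p + "R"]
    return new

bad = {"x1R": 1, "x1O": 1, "x2R": 1, "x2O": 3}
good = shift_to_regular_time(bad)
print(profit(bad), profit(good), profit(good) - profit(bad))

# Brute-force search over a half-unit grid (illustrative shortcut only).
def halves(top):
    return [k / 2 for k in range(2 * top + 1)]

grid = (dict(zip(UPPER, values))
        for values in product(*(halves(UPPER[name]) for name in UPPER)))
best = max((s for s in grid if feasible(s)), key=profit)
print(best, profit(best), acceptable(best))
```

The most profitable grid point satisfies management's rule even though the rule was never imposed during the search, which is the key property at work.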

Notice that most of the functional constraints in the model are upper bound 
constraints, i.e., constraints that simply specify the maximum value allowed for an 
individual variable. When a computer code is available for the special streamlined 
version of the simplex method for dealing with such constraints (see Secs. 7.5 and 
9.1), it provides a very efficient way of solving even extremely large problems of this 
type. 


Extensions 


Thus far we have focused on the special case of separable programming where the 
only nonlinear function is the objective function f(x). Now consider briefly the general 
case where the constraint functions $g_i(x)$ need not be linear, but are convex and 
separable, so that each $g_i(x)$ can be expressed as a sum of functions of individual 
variables, 


$$g_i(x) = \sum_{j=1}^{n} g_{ij}(x_j),$$

where each $g_{ij}(x_j)$ is a convex function. Once again, each of these new functions may 
be approximated as closely as desired by a piecewise linear function (if it is not 
already in that form). The one new restriction is that for each variable $x_j$ 
($j = 1, 2, \ldots, n$), all of the piecewise linear approximations of the functions of 
this variable [$f_j(x_j)$, $g_{1j}(x_j)$, ..., $g_{mj}(x_j)$] must have the same breakpoints so that the 
same new variables ($x_{j1}$, $x_{j2}$, ..., $x_{jn_j}$) can be used for all of these piecewise linear 
functions. This formulation leads to a linear programming model just like the one 
given for the special case except that for each i and j, the $x_{jk}$ variables now have 
different coefficients in constraint i [where these coefficients are the corresponding 
slopes of the piecewise linear function approximating $g_{ij}(x_j)$]. Because the $g_{ij}(x_j)$ are 
required to be convex, essentially the same logic as before implies that the key property 
of separable programming still must hold. [See Prob. 54(b).] 

One drawback of approximating functions by piecewise linear functions as 
described in this section is that achieving a close approximation requires a large number 
of line segments (variables), whereas such a fine grid for the breakpoints is needed 
only in the immediate neighborhood of an optimal solution. Therefore, more 
sophisticated approaches that use a succession of two-segment piecewise linear functions 
have been developed¹ to obtain successively closer approximations within this 
immediate neighborhood. This kind of approach tends to be both faster and more 
accurate in closely approximating an optimal solution. 


14.9 Convex Programming 


We already have discussed some special cases of convex programming in Secs. 14.4 
and 14.5 (unconstrained problems), 14.7 (quadratic objective function with linear 
constraints), and 14.8 (separable functions). You also have seen some theory for the 
general case (necessary and sufficient conditions for optimality) in Sec. 14.6. In this 
section, we briefly discuss some of the types of approaches used to solve the general 
convex programming problem [where the objective function f(x) to be maximized is 
concave and the $g_i(x)$ constraint functions are convex] and then present one example 
of an algorithm for convex programming. 

There is no single standard algorithm that always is used to solve convex 
programming problems. Many different algorithms have been developed, each with its 
own advantages and disadvantages, and research continues to be active in this area. 
Roughly speaking, most of these algorithms fall into one of the following three 
categories. 

One category is gradient algorithms, where the gradient search procedure of 
Sec. 14.5 is modified in some way to keep the search path from penetrating any 
constraint boundary. For example, one popular gradient method is the generalized 
reduced gradient (GRG) method.² 

The second category —sequential unconstrained algorithms — includes penalty 
function and barrier function methods. These algorithms convert the original con- 
strained optimization problem into a sequence of unconstrained optimization problems 
whose optimal solutions converge to the optimal solution for the original problem. 
Each of these unconstrained optimization problems can be solved by the gradient 
search procedure of Sec. 14.5. This conversion is accomplished by incorporating the 
constraints into a penalty function (or barrier function) that is subtracted from the 
objective function in order to impose large penalties for violating constraints (or even 
being near constraint boundaries). You will see one example of this category of 
algorithms in the next section. 

A third category—sequential-approximation algorithms—includes linear- 
approximation and quadratic-approximation methods. These algorithms replace the 
nonlinear objective function by a succession of linear or quadratic approximations. 
For linearly constrained optimization problems, these approximations allow repeated 
application of linear or quadratic programming algorithms. This work is accompanied 
by other analysis that yields a sequence of solutions that converges to an optimal 
solution for the original problem. Although these algorithms are particularly suitable 
for linearly constrained optimization problems, some of them also can be extended to 
problems with nonlinear constraint functions by the use of appropriate linear 
approximations. 


1 Meyer, R. R.: "Two-Segment Separable Programming," Management Science, 25:385-395, 1979. 


2 Lasdon, L. S., and A. D. Warren: "Generalized Reduced Gradient Software for Linearly and Nonlinearly 
Constrained Problems," in H. G. Greenberg (ed.), Design and Implementation of Optimization Software, 
Sijthoff and Noordhoff, Alphen aan den Rijn, The Netherlands, 1978. 

As one example of a sequential-approximation algorithm, we present here the 
Frank-Wolfe algorithm¹ for the case of linearly constrained convex programming 
(so the constraints are $Ax \le b$, $x \ge 0$ in matrix form). This procedure is particularly 
straightforward; it combines linear approximations of the objective function (enabling 
us to use the simplex method) with the one-dimensional search procedure of 
Sec. 14.4. 


A Sequential-Linear-Approximation Algorithm (Frank-Wolfe) 


Given a feasible trial solution $x'$, the linear approximation used for the objective 
function f(x) is the first-order Taylor's series expansion of f(x) around $x = x'$, 
namely, 

$$f(x) \approx f(x') + \sum_{j=1}^{n} \frac{\partial f(x')}{\partial x_j} (x_j - x_j') = f(x') + \nabla f(x')(x - x').$$

Because $f(x')$ and $\nabla f(x') x'$ have fixed values, they can be dropped to give an 
equivalent linear objective function, 


$$g(x) = \nabla f(x')\, x.$$


The simplex method (or the graphical procedure if n = 2) then is applied to the 
resulting linear programming problem to find its optimal solution $x_{LP}$. Note that the 
linear objective function necessarily increases steadily as one moves along the line 
segment from $x'$ to $x_{LP}$ (which is on the boundary of the feasible region). However, 
the linear approximation may not be a particularly close one for x far from $x'$, so the 
nonlinear objective function may not continue to increase all the way from $x'$ to $x_{LP}$. 
Therefore, rather than just accepting $x_{LP}$ as the next trial solution, we choose the point 
that maximizes the nonlinear objective function along this line segment. This point 
may be found by conducting the one-dimensional search procedure of Sec. 14.4, 
where the one variable for purposes of this search is the fraction t of the total distance 
from $x'$ to $x_{LP}$. This point then becomes the new trial solution for initiating the next 
iteration of the algorithm, as just described. The sequence of trial solutions generated 
by repeated iterations converges to an optimal solution for the original problem, so 
the algorithm stops as soon as the successive trial solutions are close enough together 
to have essentially reached this optimal solution. 


Summary of Frank-Wolfe Algorithm 


Initialization Step: Find a feasible initial trial solution x, e.g., by applying linear 
programming procedures to find an initial basic feasible solution. Set k = 1. 


Iterative Step 


Part 1: For j = 1, 2, ..., n, evaluate \(\partial f(x)/\partial x_j\) at \(x = x^{(k-1)}\) and set
\(c_j\) equal to this value.


1 Frank, M., and P. Wolfe: "An Algorithm for Quadratic Programming," Naval Research Logistics Quarterly,
3:95-110, 1956. Although originally designed for quadratic programming, this algorithm is easily
adapted to the case of a general concave objective function considered here.


Part 2: Find an optimal solution \(x_{LP}^{(k)}\) to the following linear programming
problem.

    Maximize     \( g(x) = \sum_{j=1}^{n} c_j x_j, \)

    subject to   \( Ax \le b \)   and   \( x \ge 0. \)

Part 3: For the variable t (\(0 \le t \le 1\)), set

    \( h(t) = f(x) \)   for   \( x = x^{(k-1)} + t\,(x_{LP}^{(k)} - x^{(k-1)}). \)

Use some procedure such as the one-dimensional search procedure (see Sec. 14.4) to
maximize h(t) over \(0 \le t \le 1\), and set \(x^{(k)}\) equal to the corresponding x. Go to the
stopping rule.


Stopping Rule: If \(x^{(k-1)}\) and \(x^{(k)}\) are sufficiently close, stop and use \(x^{(k)}\) (or some
extrapolation of \(x^{(0)}, x^{(1)}, \ldots, x^{(k-1)}, x^{(k)}\)) as your estimate of an optimal solution.
Otherwise, reset k = k + 1 and return to the iterative step.


Now let us illustrate this procedure. 


EXAMPLE: Consider the following linearly constrained convex programming
problem.

    Maximize     \( f(x) = 5x_1 - x_1^2 + 8x_2 - 2x_2^2, \)

    subject to   \( 3x_1 + 2x_2 \le 6 \)

    and          \( x_1 \ge 0, \quad x_2 \ge 0. \)

Note that

\[ \frac{\partial f}{\partial x_1} = 5 - 2x_1, \qquad \frac{\partial f}{\partial x_2} = 8 - 4x_2, \]

so that the unconstrained maximum, \(x = (\tfrac{5}{2}, 2)\), violates the functional constraint.
Thus more work is needed to find the constrained maximum.

Because x = (0, 0) is clearly feasible (and corresponds to the initial basic
feasible solution for the linear programming constraints), let us choose it as the initial
trial solution \(x^{(0)}\) for the Frank-Wolfe algorithm. Plugging \(x_1 = 0\) and \(x_2 = 0\) into
the expressions for the partial derivatives gives \(c_1 = 5\) and \(c_2 = 8\), so that
\(g(x) = 5x_1 + 8x_2\) is the initial linear approximation of the objective function. Graphically
solving this linear programming problem (see Fig. 14.14a) yields \(x_{LP}^{(1)} = (0, 3)\). For
part 3 of the iterative step, the points on the line segment between (0, 0) and (0, 3)
shown in Fig. 14.14a are expressed by

\[ (x_1, x_2) = (0, 0) + t[(0, 3) - (0, 0)] = (0, 3t) \quad \text{for } 0 \le t \le 1, \]

as shown in the sixth column of Table 14.5. This expression then gives

\[ h(t) = f(0, 3t) = 8(3t) - 2(3t)^2 = 24t - 18t^2, \]

so that the value t = t* that maximizes h(t) over \(0 \le t \le 1\) may be obtained in this
case by setting

\[ \frac{dh(t)}{dt} = 24 - 36t = 0, \]

so that \(t^* = \tfrac{2}{3}\). This result yields the next trial solution,

\[ x^{(1)} = (0, 0) + \tfrac{2}{3}[(0, 3) - (0, 0)] = (0, 2), \]

which completes the first iteration.

Figure 14.14 Illustration of the Frank-Wolfe algorithm.


To sketch the calculations that lead to the results in the second row of Table
14.5, note that \(x^{(1)} = (0, 2)\) gives

\[ c_1 = 5 - 2(0) = 5, \qquad c_2 = 8 - 4(2) = 0. \]

For the objective function, \(g(x) = 5x_1\), graphically solving the problem over the
feasible region in Fig. 14.14a gives \(x_{LP}^{(2)} = (2, 0)\). Therefore, the expression for the
line segment between \(x^{(1)}\) and \(x_{LP}^{(2)}\) (see Fig. 14.14a) is

\[ (x_1, x_2) = (0, 2) + t[(2, 0) - (0, 2)] = (2t, 2 - 2t), \]

so that

\[ h(t) = f(2t, 2 - 2t) = 5(2t) - (2t)^2 + 8(2 - 2t) - 2(2 - 2t)^2 = 8 + 10t - 12t^2. \]

Setting

\[ \frac{dh(t)}{dt} = 10 - 24t = 0 \]

yields \(t^* = \tfrac{5}{12}\). Hence

\[ x^{(2)} = (0, 2) + \tfrac{5}{12}[(2, 0) - (0, 2)] = \left(\tfrac{5}{6}, \tfrac{7}{6}\right). \]


Table 14.5 Application of Frank-Wolfe Algorithm to Example

k   x^(k-1)   c_1   c_2   x_LP^(k)   x for h(t)      h(t)                t*      x^(k)
1   (0, 0)    5     8     (0, 3)     (0, 3t)         24t - 18t^2         2/3     (0, 2)
2   (0, 2)    5     0     (2, 0)     (2t, 2 - 2t)    8 + 10t - 12t^2     5/12    (5/6, 7/6)

You can see in Fig. 14.14b how the trial solutions keep alternating between two
trajectories that appear to intersect at approximately the point \(x = (1, \tfrac{3}{2})\). This point
is, in fact, the optimal solution, as can be verified by applying the KKT conditions
from Sec. 14.6.
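
The KKT check mentioned here can be written out briefly. The block below is a sketch that assumes the usual form of the conditions for a maximization problem with one functional constraint, with u denoting the multiplier for \(3x_1 + 2x_2 \le 6\) (the symbol u is an assumption of this sketch).

```latex
% Stationarity, with x_1 > 0 and x_2 > 0 at the candidate point (1, 3/2):
%   5 - 2x_1 - 3u = 0,      8 - 4x_2 - 2u = 0.
% Substituting x = (1, 3/2):
\begin{aligned}
5 - 2(1) - 3u &= 0 \;\Rightarrow\; u = 1, \\
8 - 4\left(\tfrac{3}{2}\right) - 2u &= 0 \;\Rightarrow\; u = 1, \\
3(1) + 2\left(\tfrac{3}{2}\right) - 6 &= 0 \quad \text{(constraint binding, so } u(3x_1 + 2x_2 - 6) = 0\text{)}, \\
u = 1 &\ge 0 .
\end{aligned}
```

Both stationarity equations give the same nonnegative multiplier, so all the conditions hold at \((1, \tfrac{3}{2})\). Because the problem is a convex programming problem, satisfying the KKT conditions is sufficient for optimality.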

This example illustrates a common feature of the Frank-Wolfe algorithm, 
namely, that the trial solutions alternate between two (or more) trajectories. When 
they alternate in this way, we can extrapolate the trajectories to their approximate
point of intersection to estimate an optimal solution. This estimate tends to be better 
than using the last trial solution generated. The reason is that the trial solutions tend 
to converge rather slowly toward an optimal solution, so the last trial solution may 
still be quite far from optimal. 
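
The two iterations worked above, and the slow convergence just described, can be checked numerically. The following is a minimal Python sketch, not a definitive implementation. It assumes that the linear programming problem in part 2 can be solved by comparing the three corner points of this small feasible region (a stand-in for the simplex method), and it replaces the one-dimensional search of Sec. 14.4 with bisection on the sign of dh/dt. The names `frank_wolfe`, `grad_f`, and `VERTICES` are illustrative.

```python
def frank_wolfe(grad, vertices, x0, iterations):
    """Return the trial solutions x(0), x(1), ..., x(iterations)."""
    x = tuple(float(v) for v in x0)
    history = [x]
    for _ in range(iterations):
        c = grad(x)  # Part 1: c_j = df/dx_j evaluated at x(k-1)
        # Part 2: an LP optimum lies at a corner point, so compare c.x at each
        # one (assumption: the corner points are known in advance).
        x_lp = max(vertices, key=lambda v: sum(cj * vj for cj, vj in zip(c, v)))
        d = [vj - xj for vj, xj in zip(x_lp, x)]

        def dh(t):
            # h'(t) = grad f(x + t d) . d
            point = [xj + t * dj for xj, dj in zip(x, d)]
            return sum(gj * dj for gj, dj in zip(grad(point), d))

        # Part 3: f is concave, so h'(t) never increases; bisect on its sign.
        lo, hi = 0.0, 1.0
        if dh(hi) >= 0.0:
            lo = hi  # h is still increasing at t = 1, so take the full step
        else:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if dh(mid) > 0.0:
                    lo = mid
                else:
                    hi = mid
        t = 0.5 * (lo + hi)
        x = tuple(xj + t * dj for xj, dj in zip(x, d))
        history.append(x)
    return history


def f(x):
    return 5 * x[0] - x[0] ** 2 + 8 * x[1] - 2 * x[1] ** 2


def grad_f(x):
    return (5 - 2 * x[0], 8 - 4 * x[1])


# Corner points of 3x1 + 2x2 <= 6, x1 >= 0, x2 >= 0.
VERTICES = [(0.0, 0.0), (2.0, 0.0), (0.0, 3.0)]

trials = frank_wolfe(grad_f, VERTICES, (0.0, 0.0), 25)
for k in (1, 2, 25):
    print(k, trials[k], f(trials[k]))
```

The first two trial solutions printed should agree with (0, 2) and (5/6, 7/6) from Table 14.5. Later trial solutions can be compared with (1, 3/2) to see how gradually the sequence approaches the optimal solution.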

In conclusion, we should emphasize that the Frank-Wolfe algorithm is just one 
example of sequential-approximation algorithms. Many of these algorithms use quad- 
ratic instead of linear approximations at each iteration because quadratic approxi- 
mations provide a considerably closer fit to the original problem and thus enable the 
sequence of solutions to converge considerably more rapidly toward an optimal solution
than was the case in Fig. 14.14b. For this reason, even though sequential-linear-approximation
methods such as the Frank-Wolfe algorithm are relatively straightforward
to use, sequential-quadratic-approximation methods¹ now are generally preferred
in actual applications. Popular among these are the so-called quasi-Newton (or variable
metric) methods, which compute a quadratic approximation to the curvature of a 
nonlinear function without explicitly calculating second (partial) derivatives. (For lin- 
early constrained optimization problems, this nonlinear function is just the objective 
function; whereas with nonlinear constraints, it is the Lagrangian function described 
in Appendix 2.) Some quasi-Newton algorithms don’t even explicitly form and solve 
an approximating quadratic programming problem at each iteration, but instead in- 
corporate some of the basic ingredients of gradient algorithms. 

For further information about the state of the art in convex programming algo- 
rithms, see Selected References 6 and 7. 


1 For a survey of these methods, see Powell, M. J. D.: "Variable Metric Methods for Constrained Optimization,"
in Bachem, A., M. Grötschel, and B. Korte (eds.), Mathematical Programming: The State of
the Art, Springer-Verlag, Berlin, Heidelberg, New York, and Tokyo, 1983, pp. 288-311.


14.10 Nonconvex Programming


The assumptions of convex programming are very convenient ones, because they
ensure that any local maximum also is a global maximum. Unfortunately, the nonlinear
programming problems that arise in practice frequently only come fairly close to
satisfying these assumptions, but they have some relatively minor disparities. What
kind of approach can be used to deal with such nonconvex programming problems?

A common approach is to apply an algorithmic search procedure that will stop 
when it finds a local maximum and then to restart it a number of times from a variety 
of initial trial solutions in order to find as many distinct local maxima as possible. 
The best of these local maxima is then chosen for implementation. Normally, the 
search procedure is one that has been designed to find a global maximum when all of 
the assumptions of convex programming hold, but it also can operate to find a local 
maximum when they do not. 
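
The restart strategy just described can be sketched in a few lines of Python. Everything in this block is an illustrative assumption rather than part of the text: the one-variable function `f` (chosen because it has two local maxima), the fixed-step gradient ascent used as the local search, and the list of starting points.

```python
def f(x):
    # Hypothetical nonconcave objective with local maxima near x = 1 and x = 3.
    return -((x - 1.0) ** 2) * (x - 3.0) ** 2 + 0.5 * x


def df(x):
    # Derivative of f: -4(x - 1)(x - 2)(x - 3) + 1/2.
    return -4.0 * (x - 1.0) * (x - 2.0) * (x - 3.0) + 0.5


def local_search(x, step=0.01, iterations=5000):
    """Fixed-step gradient ascent; it stops climbing at a nearby local maximum."""
    for _ in range(iterations):
        x += step * df(x)
    return x


starts = [0.0, 1.0, 2.5, 4.0]          # a variety of initial trial solutions
local_maxima = [local_search(x0) for x0 in starts]
best = max(local_maxima, key=f)        # keep the best local maximum found

for x0, x_loc in zip(starts, local_maxima):
    print(f"start {x0:4.1f} -> local maximum at x = {x_loc:.4f}, f = {f(x_loc):.4f}")
print("best:", round(best, 4))
```

Starting points on either side of the valley near x = 2 lead to different local maxima, which is why a single run cannot be trusted to find the global maximum of a nonconvex problem.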

One such search procedure that has been widely used since its development in
the 1960s is the sequential unconstrained minimization technique (or SUMT for
short).¹ There actually are two main versions of SUMT, one of which is an exterior-point
algorithm that deals with infeasible solutions while using a penalty function to
force convergence to the feasible region. We shall describe the other version, which 
is as an interior-point algorithm that deals directly with feasible solutions while using 
a barrier function to force staying inside the feasible region. Although SUMT was 
originally presented as a minimization technique, we shall convert it to a maximization 
technique in order to be consistent with the rest of the chapter. Therefore, we continue 
to assume that the problem is in the form given at the beginning of the chapter, and 
that all of the functions are differentiable. 


Sequential Unconstrained Minimization Technique (SUMT) 


As the name implies, SUMT replaces the original problem by a sequence of uncon- 
strained optimization problems whose solutions converge to a solution (local maxi- 
mum) of the original problem. This approach is very attractive because unconstrained 
optimization problems are much easier to solve (see the gradient search procedure in 
Sec. 14.5) than those with constraints. Each of the unconstrained problems in this 
sequence involves choosing a (successively smaller) strictly positive value of a scalar 
r and then solving for x so as to 


    Maximize   \( P(x; r) = f(x) - rB(x). \)


B(x) is a barrier function that has the following properties (for x that are feasible 
for the original problem): 


1. B(x) is small when x is far from the boundary of the feasible region,

2. B(x) is large when x is close to the boundary of the feasible region,

3. \(B(x) \to \infty\) as the distance from the (nearest) boundary of the feasible
region \(\to 0\).


Thus, by starting the search procedure with a feasible initial trial solution and then 
attempting to increase P(x; r), B(x) provides a barrier that prevents the search from 
ever crossing (or even reaching) the boundary of the feasible region for the original 
problem. 

The most common choice of B(x) is 


\[ B(x) = \sum_{i=1}^{m} \frac{1}{b_i - g_i(x)} + \sum_{j=1}^{n} \frac{1}{x_j}. \]





1 Fiacco, Anthony V., and Garth P. McCormick: Nonlinear Programming: Sequential Unconstrained Min- 
imization Techniques, Wiley, New York, 1968. 


For feasible values of x, note that the denominator of each term is proportional to the
distance of x from the constraint boundary for the corresponding functional or non- 
negativity constraint. Consequently, each term is a boundary repulsion term that has 
all of the preceding three properties with respect to this particular constraint boundary. 
Another attractive feature of this B(x) is that, when all the assumptions of convex 
programming are satisfied, P(x; r) is a concave function. 
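
The three barrier properties are easy to see numerically. The short Python sketch below evaluates the common choice of B(x) for an assumed illustrative feasible region, \(x_1^2 + x_2 \le 3\) with \(x_1 \ge 0\), \(x_2 \ge 0\); the function name `barrier` and the sample points are hypothetical.

```python
def barrier(x1, x2):
    """B(x) = 1/(b1 - g1(x)) + 1/x1 + 1/x2 for the assumed constraint x1^2 + x2 <= 3."""
    slack = 3.0 - x1 ** 2 - x2
    if slack <= 0.0 or x1 <= 0.0 or x2 <= 0.0:
        raise ValueError("x must be strictly inside the feasible region")
    return 1.0 / slack + 1.0 / x1 + 1.0 / x2


# Move from an interior point toward the boundary x1^2 + x2 = 3 along x1 = 1.
values = [barrier(1.0, x2) for x2 in (1.0, 1.9, 1.99, 1.999)]
for x2, b in zip((1.0, 1.9, 1.99, 1.999), values):
    print(f"x = (1, {x2}): B(x) = {b:.3f}")
```

Each tenfold reduction in the remaining slack multiplies the dominant term of B(x) by roughly ten, which is the repulsion effect described above.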

Because B(x) keeps the search away from the boundary of the feasible region, 
you probably are asking the very legitimate question: What happens if the desired 
solution lies there? This concern is the reason that SUMT involves solving a sequence 
of these unconstrained optimization problems for successively smaller values of r 
approaching zero (where the final trial solution from each one becomes the initial trial 
solution for the next). For example, each new r might be obtained from the preceding
one by multiplying by a constant \(\theta\) (\(0 < \theta < 1\)), where a typical value is \(\theta = 0.01\).
As r approaches zero, P(x; r) approaches f(x), so the corresponding local maximum
of P(x; r) converges to a local maximum of the original problem. Therefore, it is
necessary to solve only enough unconstrained optimization problems to permit
extrapolating their solutions to this limiting solution.

How many are enough to permit this extrapolation? When the original problem 
satisfies the assumptions of convex programming, useful information is available to 
guide us in this decision. In particular, if \(\bar{x}\) is a global maximum of P(x; r), then

\[ f(\bar{x}) \le f(x^*) \le f(\bar{x}) + rB(\bar{x}), \]

where \(x^*\) is the (unknown) optimal solution for the original problem. Thus, \(rB(\bar{x})\) is
the maximum error (in the value of the objective function) that can result by using
\(\bar{x}\) to approximate \(x^*\), and extrapolating beyond \(\bar{x}\) to increase f(x) further decreases
this error. If an error tolerance is established in advance, then you can stop as soon
as \(rB(\bar{x})\) is less than this quantity.

Unfortunately, no such guarantee for the maximum error can be given for nonconvex
programming problems. However, \(rB(\bar{x})\) still is likely to exceed the actual
error when \(\bar{x}\) and \(x^*\) now are corresponding local maxima of P(x; r) and the original
problem, respectively.


Summary of SUMT 


Initialization Step: Identify a feasible initial trial solution \(x^{(0)}\) that is not on the boundary
of the feasible region. Set k = 1 and choose appropriate strictly positive values
for the initial r¹ and for \(\theta < 1\) (say, r = 1 and \(\theta = 0.01\)).


Iterative Step: Starting from \(x^{(k-1)}\), apply the gradient search procedure described in
Sec. 14.5 (or some similar method) to find a local maximum, \(x^{(k)}\), of

\[ P(x; r) = f(x) - r\left[\sum_{i=1}^{m} \frac{1}{b_i - g_i(x)} + \sum_{j=1}^{n} \frac{1}{x_j}\right]. \]


Stopping Rule: If the change from \(x^{(k-1)}\) to \(x^{(k)}\) is negligible, stop and use \(x^{(k)}\) (or an
extrapolation of \(x^{(0)}, x^{(1)}, \ldots, x^{(k-1)}, x^{(k)}\)) as your estimate of a local maximum of
the original problem. Otherwise, reset k = k + 1 and \(r = \theta r\) and return to the
iterative step.


1 A reasonable criterion for choosing the initial r is one that makes rB(x) about the same order of magnitude 
as f(x) for feasible solutions x that are not particularly close to the boundary. 
When the assumptions of convex programming are not satisfied, this algorithm
should be repeated a number of times by starting from a variety of feasible initial trial 
solutions. The best of the local maxima thereby obtained for the original problem 
should be used as the best available approximation of a global maximum. 

Finally, note that SUMT also can be easily extended to deal with equality
constraints, \(g_i(x) = b_i\). One standard way is as follows. For each equality constraint,

\[ \frac{-[b_i - g_i(x)]^2}{\sqrt{r}} \quad \text{replaces} \quad \frac{-r}{b_i - g_i(x)} \]

in the expression for P(x; r) given under Summary of SUMT, and then the same
procedure is used. The numerator, \(-[b_i - g_i(x)]^2\), imposes a large penalty for deviating
substantially from satisfying the equality constraint, and then the denominator
tremendously increases this penalty as r is decreased to a tiny amount, thereby forcing
the sequence of trial solutions to converge toward a point that satisfies the constraint.
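
Written out in full, the modified function for a problem with both kinds of functional constraints would take the following form. This is a sketch under stated assumptions: the index sets I (inequality constraints) and E (equality constraints) are notation introduced here for illustration, and the square-root denominator follows the substitution just described.

```latex
P(x; r) = f(x)
  - r\left[\sum_{i \in I} \frac{1}{b_i - g_i(x)} + \sum_{j=1}^{n} \frac{1}{x_j}\right]
  - \sum_{i \in E} \frac{[b_i - g_i(x)]^2}{\sqrt{r}}
```

As r shrinks, the barrier terms in the brackets fade while the equality terms grow by the factor \(1/\sqrt{r}\), so the two groups of terms pull in complementary directions.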

SUMT has been widely used because of its simplicity and versatility. However, 
numerical analysts have found that it is relatively prone to numerical instability, so 
considerable caution is advised. For further information on this issue, as well as similar 
analyses for alternative algorithms, see Selected Reference 7. 


EXAMPLE: To illustrate SUMT, consider the following two-variable problem.

    Maximize     \( f(x) = x_1 x_2, \)

    subject to   \( x_1^2 + x_2 \le 3 \)

    and          \( x_1 \ge 0, \quad x_2 \ge 0. \)

Even though \(g_1(x) = x_1^2 + x_2\) is convex (because each term is convex), this problem
is a nonconvex programming problem because \(f(x) = x_1 x_2\) is not concave (see Appendix 1).

For the initialization step, \((x_1, x_2) = (1, 1)\) is one obvious feasible solution that
is not on the boundary of the feasible region, so we can set \(x^{(0)} = (1, 1)\). Reasonable
choices for r and \(\theta\) are r = 1 and \(\theta = 0.01\).

For the iterative step,

\[ P(x; r) = x_1 x_2 - r\left(\frac{1}{3 - x_1^2 - x_2} + \frac{1}{x_1} + \frac{1}{x_2}\right). \]




With r = 1, applying the gradient search procedure starting from (1, 1) to
maximize this expression eventually leads to \(x^{(1)} = (0.90, 1.36)\). Resetting r = 0.01
and restarting the gradient search procedure from (0.90, 1.36) then leads to
\(x^{(2)} = (0.983, 1.933)\). One more iteration with r = 0.01(0.01) = 0.0001 leads from \(x^{(2)}\) to
\(x^{(3)} = (0.998, 1.994)\). This sequence of points, summarized in Table 14.6, quite
clearly is converging to (1, 2). Applying the KKT conditions to this solution verifies
that it does indeed satisfy the necessary condition for optimality. Graphical analysis
demonstrates that \((x_1, x_2) = (1, 2)\) is, in fact, a global maximum. [See Prob. 60(b).]


Table 14.6 Illustration of SUMT

k    r        x_1^(k)    x_2^(k)
0             1          1
1    1        0.90       1.36
2    10^-2    0.983      1.933
3    10^-4    0.998      1.994
     limit    1          2


For this problem, there are no local maxima other than \((x_1, x_2) = (1, 2)\),¹ so
reapplying SUMT from various feasible initial trial solutions always leads to this same
solution.
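
The SUMT iterations for this example can be imitated with a short Python sketch. It is only an approximation of the procedure in the text: the inner search is plain gradient ascent with step halving (an assumed stand-in for the gradient search procedure of Sec. 14.5), the iteration limits and tolerances are arbitrary choices, and the names `P`, `grad_P`, `maximize_P`, and `sumt` are illustrative.

```python
import math


def P(x1, x2, r):
    """P(x; r) = f(x) - r B(x); -inf outside the interior of the feasible region."""
    slack = 3.0 - x1 * x1 - x2
    if slack <= 0.0 or x1 <= 0.0 or x2 <= 0.0:
        return -math.inf
    return x1 * x2 - r * (1.0 / slack + 1.0 / x1 + 1.0 / x2)


def grad_P(x1, x2, r):
    slack = 3.0 - x1 * x1 - x2
    g1 = x2 - r * (2.0 * x1 / slack ** 2 - 1.0 / x1 ** 2)
    g2 = x1 - r * (1.0 / slack ** 2 - 1.0 / x2 ** 2)
    return g1, g2


def maximize_P(x1, x2, r, max_steps=20000, tol=1e-6):
    """Gradient ascent with step halving, started from an interior point."""
    for _ in range(max_steps):
        g1, g2 = grad_P(x1, x2, r)
        norm2 = g1 * g1 + g2 * g2
        if norm2 < tol * tol:
            break
        p0, t = P(x1, x2, r), 1.0
        # Halve t until the step stays interior and increases P sufficiently.
        while t > 1e-14 and P(x1 + t * g1, x2 + t * g2, r) < p0 + 1e-4 * t * norm2:
            t *= 0.5
        if t <= 1e-14:
            break  # no usable step remains at this precision
        x1, x2 = x1 + t * g1, x2 + t * g2
    return x1, x2


def sumt(x, r=1.0, theta=0.01, rounds=3):
    """Solve the sequence of unconstrained problems, shrinking r by theta each round."""
    trail = []
    for _ in range(rounds):
        x = maximize_P(x[0], x[1], r)
        trail.append((r, x))
        r *= theta
    return trail


trail = sumt((1.0, 1.0))
for r, (x1, x2) in trail:
    print(f"r = {r:g}: x = ({x1:.3f}, {x2:.3f})")
```

The printed points can be compared with the values quoted in the text for r = 1, 0.01, and 0.0001. Small differences are to be expected, because the stopping tolerance of the inner search here is an assumption rather than the book's setting.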


14.11 Conclusions 


Practical optimization problems frequently involve nonlinear behavior that must be
taken into account. It is sometimes possible to reformulate these nonlinearities to fit
into a linear programming format, as can be done for separable programming problems.
However, it is frequently necessary to use a nonlinear programming formulation.

In contrast to the case of the simplex method for linear programming, there is 
no efficient all-purpose algorithm that can be used to solve all nonlinear programming
problems. In fact, some of these problems cannot be solved in a very satisfactory
manner by any method. However, considerable progress has been made for some
important classes of problems, including quadratic programming, convex programming,
and certain special types of nonconvex programming. A variety of algorithms
that frequently perform well are available for these cases. Some of these algorithms 
incorporate highly efficient procedures for unconstrained optimization for a portion 
of each iteration, and some use a succession of linear or quadratic approximations to 
the original problem. 

There has been a strong emphasis in recent years on developing high-quality, 
reliable software packages for general use in applying the best of these algorithms on 
mainframe computers. (See Selected References 6 and 7.) For example, several pow- 
erful software packages such as MINOS (discussed in Sec. 4.9) have been developed 
in the Systems Optimization Laboratory at Stanford University. These packages are 
widely used elsewhere for solving many of the types of problems discussed in this 
chapter (as well as linear programming problems). The steady improvements being 
made in both algorithmic techniques and software now are bringing some rather large 
problems into the range of computational feasibility. 

With the current rapid growth in the use and power of personal computers, good 
progress is being made in nonlinear programming software development for micro- 
computers. For example, the GAMS/MINOS package (a combination of two well- 
known mainframe programs) now is available for use on IBM personal computers. 
Another prominent package called GINO (see Selected Reference 8) was developed 
specifically for microcomputers. 

Research in nonlinear programming remains very active. 


1 The technical reason is that f(x) is a (strictly) quasiconcave function that shares the property of concave
functions that a local maximum always is a global maximum. For further information, see Avriel, Mordecai,
W. Erwin Diewert, Siegfried Schaible, and Israel Zang: Generalized Concavity, Plenum, New York, 1985.
PROBLEMS


1. Consider the product mix problem described in Chap. 3, Prob. 2. Suppose that this
manufacturing firm actually encounters price elasticity in selling the three products, so that the
profits would be different from those stated in Chap. 3. In particular, suppose that the unit
costs for producing products 1, 2, and 3 are $25, $10, and $15, respectively, and that the prices
required (in dollars) in order to be able to sell \(x_1\), \(x_2\), and \(x_3\) units are \((35 + 100x_1^{-1/3})\),
\((15 + 40x_2^{-1/4})\), and \((20 + 50x_3^{-1/2})\), respectively.

(a) Formulate a linearly constrained optimization model for the problem of determining
how many units of each product the firm should produce to maximize profit.

(b) Verify that this problem is a convex programming problem.

(c) Starting from the initial trial solution \((x_1, x_2, x_3) = (10, 10, 10)\), apply two iterations
of the Frank-Wolfe algorithm.


2. For the P & T Co. problem described in Sec. 7.1, suppose that there is a 10 percent
discount in the shipping cost for all truckloads beyond the first 40 for each combination of 
cannery and warehouse. Show that this situation leads to a nonconvex programming problem. 


3. Consider the following example of the portfolio selection problem with risky securities
described in Sec. 14.1. Just two stocks are being considered for inclusion in the portfolio.
The estimated mean and variance of the return on each share of stock 1 are 5 and 4, respectively,
whereas the corresponding quantities for stock 2 are 10 and 100, respectively. The covariance
of the return on one share each of the two stocks is 5. The price per share is 20 for stock 1
and 30 for stock 2, where the total amount budgeted for the portfolio is 50.

(a) Without assigning a specific numerical value to \(\beta\), formulate the quadratic programming
model for this problem.

(b) Verify that the model in part (a) is a convex programming problem by using the test
in Appendix 1 to show that the objective function is concave.

(c) Starting from the initial trial solution \((x_1, x_2) = (0, 0)\), apply five iterations of the
Frank-Wolfe algorithm to this problem with \(\beta = 0.1\).

(d) Repeat part (c) for \(\beta = 0.02\) (low aversion to risk) and \(\beta = 0.5\) (high aversion to
risk). Also solve by inspection the extreme cases of \(\beta = 0\) (no aversion to risk)
and \(\beta = \infty\) (complete aversion to risk), where the latter case means to minimize
V(x). Characterize the solutions for the five values of \(\beta\) in terms of the relative
concentration on the conservative investment (stock 1) or the risky investment
(stock 2).


4. Consider the variation of the Wyndor Glass Co. example represented in Fig. 14.5,
where the second and third functional constraints of the original problem (see Sec. 3.1) have
been replaced by \(9x_1^2 + 5x_2^2 \le 216\).

(a) Demonstrate that \((x_1, x_2) = (2, 6)\) with Z = 36 is indeed optimal by showing that
the objective function line, \(36 = 3x_1 + 5x_2\), is tangent to this constraint boundary
at (2, 6). (Hint: Express \(x_2\) in terms of \(x_1\) on this boundary, and then differentiate
this expression with respect to \(x_1\) to find the slope of the boundary.)

(b) Starting from the initial trial solution \((x_1, x_2) = (2, 3)\), apply SUMT to this problem,
with \(r = 1, 10^{-2}, 10^{-4}\).


5. Consider the variation of the Wyndor Glass Co. problem represented in Fig. 14.6,
where the original objective function (see Sec. 3.1) has been replaced by
\(Z = 126x_1 - 9x_1^2 + 182x_2 - 13x_2^2\). Demonstrate that \((x_1, x_2) = (\tfrac{8}{3}, 5)\) with Z = 857 is indeed optimal by
showing that the ellipse, \(857 = 126x_1 - 9x_1^2 + 182x_2 - 13x_2^2\), is tangent to the constraint
boundary, \(3x_1 + 2x_2 = 18\), at \((\tfrac{8}{3}, 5)\). (Hint: Solve for \(x_2\) in terms of \(x_1\) for the ellipse, and
then differentiate this expression with respect to \(x_1\) to find the slope of the ellipse.)


6. Consider the following function.

    \( f(x) = 48x - 60x^2 + x^3. \)

(a) Use the first and second derivatives to find the local maxima and local minima of
f(x).

(b) Use the first and second derivatives to show that f(x) has neither a global maximum
nor a global minimum because it is unbounded in both directions.


7. For each of the following functions, show whether it is convex, concave, or neither.

(a) \( f(x) = 10x - x^2. \)

(b) \( f(x) = x^4 + 6x^2 + 12x. \)

(c) \( f(x) = 2x^3 - 3x^2. \)

(d) \( f(x) = x^4 + x^2. \)

(e) \( f(x) = x^3 + x^4. \)


8.* For each of the following functions, use the test given in Appendix 1 to determine
whether it is convex, concave, or neither.

(a) \( f(x) = x_1 x_2 - x_1^2 - x_2^2. \)

(b) \( f(x) = 3x_1 + 2x_1^2 + 4x_2 + x_2^2 - 2x_1 x_2. \)

(c) \( f(x) = x_1^2 + 3x_1 x_2 + 2x_2^2. \)

(d) \( f(x) = 20x_1 + 10x_2. \)

(e) \( f(x) = x_1 x_2. \)
9. Consider the following function.

    \( f(x) = 5x_1 + 2x_2^2 + x_3^2 - 3x_3 x_4 + 4x_4^2 + 2x_5^4 + x_5^2 + 3x_5 x_6 + 6x_6^2 + 3x_6 x_7 + x_7^2. \)

Show that f(x) is convex by expressing it as a sum of functions of one or two variables and 
then showing (see Appendix 1) that all of these functions are convex. 

10. Consider the following nonlinear programming problem. 
    Maximize     \( f(x) = x_1 + x_2, \)
    subject to   \( x_1^2 + x_2^2 \le 1 \)
    and          \( x_1 \ge 0, \quad x_2 \ge 0. \)


(a) Verify that this is a convex programming problem. 
(b) Solve this problem graphically. 
(c) Use the KKT conditions to verify that the solution you obtained in part (b) is optimal. 


11. Consider the following constrained optimization problem. 
    Maximize     \( f(x) = -6x + 3x^2 - 2x^3, \)
    subject to   \( x \ge 0. \)


Use just the first and second derivatives of f(x) to derive an optimal solution. 


12. Consider the following nonlinear programming problem. 
    Minimize     \( Z = x_1^4 + 2x_1^2 + 2x_1 x_2 + 4x_2^2, \)
    subject to   \( 2x_1 + x_2 \ge 10 \)
                 \( x_1 + 2x_2 \ge 10 \)
    and          \( x_1 \ge 0, \quad x_2 \ge 0. \)


(a) Of the special types of nonlinear programming problems described in Sec. 14.3, to
which type or types can this particular problem be fitted? Justify your answer.

(b) What are the KKT conditions for this problem? Use these conditions to determine
whether \((x_1, x_2) = (0, 10)\) can be optimal.

(c) If SUMT were to be applied directly to this problem, what would be the unconstrained
function P(x; r) to be minimized at each iteration?

(d) Setting r = 100 and using \((x_1, x_2) = (5, 5)\) as the initial trial solution, apply the
gradient search procedure with \(\epsilon = 10\) to minimize the function P(x; r) you obtained
in part (c).

(e) Now suppose that the problem were changed slightly by replacing the nonnegativity
constraints by \(x_1 \ge 1\), \(x_2 \ge 1\). Convert this new problem into an equivalent problem
that has just two functional constraints, two variables, and two nonnegativity
constraints.


13. Consider the expression given in Sec. 14.7 for the KKT conditions for the quadratic 
programming problem. Show that the problem of finding a feasible solution for these conditions 
is a linear complementarity problem, as introduced in Sec. 14.3, by identifying w, z, q, and
M in terms of the vectors and matrices in Sec. 14.7. 


14. Consider the following geometric programming problem. 
Minimize = f(x) = 2x7 xy! + xp xy? 


    subject to   \( 4x_1 x_2 + x_1^2 x_2^2 \le 12 \)

    and          \( x_1 \ge 0, \quad x_2 \ge 0. \)


(a) Transform this problem into an equivalent convex programming problem.
(b) Use the test given in Appendix 1 to verify that the model formulated in part (a) is 
indeed a convex programming problem. 


15. Consider the following linear fractional programming problem. 


    Maximize     \( f(x) = \dfrac{10x_1 + 20x_2 + 10}{3x_1 + 4x_2 + 20}, \)

    subject to   \( x_1 + 3x_2 \le 50 \)
                 \( 3x_1 + 2x_2 \le 80 \)
    and          \( x_1 \ge 0, \quad x_2 \ge 0. \)


(a) Transform this problem into an equivalent linear programming problem. 

(b) Use the simplex method (the automatic routine in your OR COURSEWARE) to 
solve the model formulated in part (a). What is the resulting optimal solution for 
the original problem? 


16.* Use the one-dimensional search procedure to solve (approximately) the following
problem.

    Maximize   \( f(x) = x^3 + 2x - 2x^2 - 0.25x^4. \)

Use an error tolerance \(\epsilon = 0.04\) and initial bounds \(\underline{x} = 0\), \(\bar{x} = 2.4\).
17. Use the one-dimensional search procedure with an error tolerance \(\epsilon = 0.04\) and
with the following initial bounds to solve (approximately) each of the following problems.
(a) Maximize \( f(x) = 6x - x^2 \), with \(\underline{x} = 0\), \(\bar{x} = 4.8\).
(b) Minimize \( f(x) = 6x + 7x^2 + 4x^3 + x^4 \), with \(\underline{x} = -4\), \(\bar{x} = 1\).
18. Use the one-dimensional search procedure to solve (approximately) the following
problem.

    Maximize   \( f(x) = 48x^5 + 42x^3 + 3.5x - 16x^6 - 61x^4 - 16.5x^2. \)

Use an error tolerance \(\epsilon = 0.08\) and initial bounds \(\underline{x} = -1\), \(\bar{x} = 4\).


19. Use the one-dimensional search procedure to solve (approximately) the following
problem.

    Maximize   \( f(x) = x^3 + 30x - x^6 - 2x^4 - 3x^2. \)

Use an error tolerance \(\epsilon = 0.07\) and find appropriate initial bounds by inspection.
20. Consider the following convex programming problem.

    Minimize     \( Z = x^4 + x^2 - 4x, \)
    subject to   \( x \le 2 \)
    and          \( x \ge 0. \)


(a) Use one simple calculation just to check whether the optimal solution lies in the
interval \(0 \le x \le 1\) or the interval \(1 \le x \le 2\). (Do not actually solve for the optimal
solution in order to determine in which interval it must lie.) Explain your logic.

(b) Use the one-dimensional search procedure with initial bounds \(\underline{x} = 0\), \(\bar{x} = 2\) and
with an error tolerance \(\epsilon = 0.02\) to solve (approximately) this problem.

(c) Use the KKT conditions to derive the optimal solution. 


21. Consider the problem of maximizing a differentiable function f(x) of a single
unconstrained variable x. Let \(\underline{x}_0\) and \(\bar{x}_0\), respectively, be a valid lower bound and upper bound
on the same global maximum (if one exists). Prove the following general properties of the
one-dimensional search procedure (as presented in Sec. 14.4) for attempting to solve such a
problem.

(a) Given \(\underline{x}_0\), \(\bar{x}_0\), and \(\epsilon = 0\), the sequence of trial solutions selected by the midpoint
rule must converge to a limiting solution. [Hint: First show that
\(\lim_{n \to \infty} (\bar{x}_n - \underline{x}_n) = 0\), where \(\bar{x}_n\) and \(\underline{x}_n\) are the upper and lower bounds identified at iteration n.]

(b) If f(x) is concave {so that [df(x)/dx] is a monotone decreasing function of x}, then
the limiting solution in part (a) must be a global maximum.

(c) If f(x) is not concave everywhere, but would be concave if its domain were restricted
to the interval between \(\underline{x}_0\) and \(\bar{x}_0\), then the limiting solution in part (a) must be a
global maximum.

(d) If f(x) is not concave even over the interval between \(\underline{x}_0\) and \(\bar{x}_0\), then the limiting
solution in part (a) need not be a global maximum. (Prove this by graphically
constructing a counterexample.)

(e) If [df(x)/dx] < 0 for all x, then no \(\underline{x}_0\) exists. If [df(x)/dx] > 0 for all x, then no
\(\bar{x}_0\) exists. In either case, f(x) does not possess a global maximum.

(f) If f(x) is concave and \(\lim_{x \to -\infty} [df(x)/dx] < 0\), then no \(\underline{x}_0\) exists. If f(x) is concave
and \(\lim_{x \to \infty} [df(x)/dx] > 0\), then no \(\bar{x}_0\) exists. In either case, f(x) does not possess a
global maximum.

22. Consider the following unconstrained optimization problem.

    Maximize   \( f(x) = 2x_1 x_2 + x_2 - x_1^2 - 2x_2^2. \)


(a) Starting from the initial trial solution \((x_1, x_2) = (1, 1)\), apply the gradient search
procedure with \(\epsilon = 0.25\) to obtain an approximate solution.

(b) Solve the system of linear equations obtained by setting Vf(x) = 0 to obtain the 
exact solution. 

(c) Referring to Fig. 14.11 as a sample for a similar problem, draw the path of trial 
solutions you obtained in part (a). Then show the apparent continuation of this path 
with your best guess for the next three trial solutions [based on the pattern in part 
(a) and in Fig. 14.11]. Also show the exact solution from part (b) toward which 
this sequence of trial solutions is converging. 


23. Repeat the three parts of Prob. 22 (except with \(\epsilon = 0.5\)) for the following
unconstrained optimization problem.


    Maximize   \( f(x) = 2x_1 x_2 - 2x_1^2 - x_2^2. \)


24. Starting from the initial trial solution \((x_1, x_2) = (1, 1)\), do two iterations of the
gradient search procedure to begin solving the following problem.


    Maximize   \( f(x) = 4x_1 x_2 - 2x_1^2 - 3x_2^2. \)

Then solve \(\nabla f(x) = 0\) directly to obtain the exact solution.


25.* Starting from the initial trial solution \((x_1, x_2) = (0, 0)\), use the gradient search
procedure with \(\epsilon = 0.3\) to obtain an approximate solution for the following problem.


    Maximize   \( f(x) = 8x_1 - x_1^2 - 12x_2 - 2x_2^2 + 2x_1 x_2. \)

Then solve \(\nabla f(x) = 0\) directly to obtain the exact solution.


26. Starting from the initial trial solution \((x_1, x_2) = (0, 0)\), do two iterations of the
gradient search procedure to begin solving the following problem.


    Maximize   \( f(x) = 6x_1 + 2x_1 x_2 - 2x_2 - 2x_1^2 - x_2^2. \)


Then solve Vf(x) = 0 directly to obtain the exact solution. 


27. Starting from the initial trial solution \((x_1, x_2) = (0, 0)\), apply two iterations of the
gradient search procedure to the following problem.


    Maximize   \( f(x) = 4x_1 + 2x_2 + x_1^2 - x_1^4 - x_1 x_2 - x_2^2. \)


For each of these iterations, approximately solve for t* by applying two iterations of the
one-dimensional search procedure with initial bounds \(\underline{t} = 0\), \(\bar{t} = 1\).


28. Starting from the initial trial solution \((x_1, x_2, x_3) = (1, 1, 1)\), use the gradient
search procedure with \(\epsilon = 0.05\) to solve (approximately) the following problem.


    Maximize   \( f(x) = 3x_1 x_2 + 3x_2 x_3 - x_1^2 - 6x_2^2 - x_3^2. \)


29.* Starting from the initial trial solution \((x_1, x_2) = (0, 0)\), use the gradient search
procedure with \(\epsilon = 1\) to solve (approximately) each of the following problems.

(a) Maximize \( f(x) = x_1 x_2 + 3x_2 - x_1^2 - x_2^2. \)

(b) Minimize \( f(x) = x_1^2 x_2^2 + 2x_1^2 + 2x_2^2 - 4x_1 + 4x_2. \)


30. Consider the following linearly constrained optimization problem.

    Maximize     \( f(x) = \ln(x_1 + x_2), \)
    subject to   \( x_1 + 2x_2 \le 5 \)
    and          \( x_1 \ge 0, \quad x_2 \ge 0, \)

where ln denotes natural logarithm.

(a) Verify that this problem is a convex programming problem.

(b) Use the KKT conditions to derive an optimal solution.

(c) Use intuitive reasoning to demonstrate that the solution obtained in part (b) is indeed
optimal. [Hint: Note that \(\ln(x_1 + x_2)\) is a monotone strictly increasing function of
\((x_1 + x_2)\).]

(d) Starting from the initial trial solution \((x_1, x_2) = (1, 1)\), use one iteration of the
Frank-Wolfe algorithm to obtain exactly the same solution you found in part (b),
and then use a second iteration to verify that it is an optimal solution (because it is
replicated exactly). Explain why exactly the same results would be obtained on these
two iterations with any other initial trial solution except (0, 0). What complication
arises with (0, 0)?


31. Consider the following linearly constrained optimization problem.

Maximize \( f(\mathbf{x}) = \ln(x_1 + 1) - x_2^2 \),
subject to \( x_1 + 2x_2 \le 3 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \),

where ln denotes natural logarithm.

(a) Verify that this problem is a convex programming problem.

(b) Use the KKT conditions to derive an optimal solution.

(c) Use intuitive reasoning to demonstrate that the solution obtained in part (b) is indeed
optimal.

(d) Starting from the initial trial solution \( (x_1, x_2) = (0, 0) \), use one iteration of the
Frank-Wolfe algorithm to obtain exactly the same solution you found in part (b),
and then use a second iteration to verify that it is an optimal solution (because it is
replicated exactly).


32. Consider the following convex programming problem.

Maximize \( f(\mathbf{x}) = 10x_1 - 2x_1^2 - x_1^3 + 8x_2 - x_2^2 \),
subject to \( x_1 + x_2 \le 2 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

(a) Use the KKT conditions to demonstrate that \( (x_1, x_2) = (1, 1) \) is not an optimal
solution.
(b) Use the KKT conditions to derive an optimal solution. 


33.* Consider the nonlinear programming problem given in Chap. 11, Prob. 13.
Determine whether \( (x_1, x_2) = (1, 2) \) can be optimal by applying the KKT conditions.


34. Consider the following convex programming problem. 


Maximize \( f(\mathbf{x}) = 24x_1 - x_1^2 + 10x_2 - x_2^2 \),
subject to \( x_1 \le 8 \)
\( x_2 \le 7 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).


(a) Use the KKT conditions for this problem to derive an optimal solution. 

(b) Decompose this problem into two separate constrained optimization problems
involving just \( x_1 \) and just \( x_2 \), respectively. For each of these two problems, plot the
objective function over the feasible region in order to demonstrate that the value of
\( x_1 \) or \( x_2 \) derived in part (a) is indeed optimal. Then prove that this value is optimal
by using just the first and second derivatives of the objective function and the
constraints for the respective problems.


35. Consider the following nonlinear programming problem.

Maximize \( f(\mathbf{x}) = \dfrac{x_1}{x_2 + 1} \),
subject to \( x_1 - x_2 \le 2 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

(a) Use the KKT conditions to demonstrate that \( (x_1, x_2) = (4, 2) \) is not optimal.

(b) Derive a solution that does satisfy the KKT conditions.

(c) Show that this problem is not a convex programming problem. 

(d) Despite the conclusion in part (c), use intuitive reasoning to show that the solution
obtained in part (b) is, in fact, optimal. [The theoretical reason is that \( f(\mathbf{x}) \) is
pseudo-concave.]

(e) Use the fact that this problem is a linear fractional programming problem to
transform it into an equivalent linear programming problem. Solve the latter problem
and thereby identify the optimal solution for the original problem. (Hint: Use the
equality constraint in the linear programming problem to substitute one of the
variables out of the model, and then solve the model graphically.)


36.* Use the KKT conditions to derive an optimal solution for each of the following
problems.

(a) Maximize \( f(\mathbf{x}) = x_1 + 2x_2 - x_2^3 \),
subject to \( x_1 + x_2 \le 1 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

(b) Maximize \( f(\mathbf{x}) = 20x_1 + 10x_2 \),
subject to \( x_1^2 + x_2^2 \le 1 \)
\( x_1 + 2x_2 \le 2 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).


37. What are the KKT conditions for nonlinear programming problems of the following
form?

Minimize \( f(\mathbf{x}) \),
subject to \( g_i(\mathbf{x}) \ge b_i \), for \( i = 1, 2, \ldots, m \)
and \( \mathbf{x} \ge \mathbf{0} \).

(Hint: Convert this form into our standard form assumed in this chapter by using the techniques
presented in Sec. 4.6, and then apply the KKT conditions as given in Sec. 14.6.)


38. Consider the following nonlinear programming problem.

Minimize \( Z = 2x_1^2 + x_2^2 \),
subject to \( x_1 + x_2 = 10 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

(a) Of the special types of nonlinear programming problems described in Sec. 14.3, to
which type or types can this particular problem be fitted? Justify your answer. (Hint:
First convert this problem to an equivalent nonlinear programming problem that fits
the form given in the second paragraph of the chapter, with m = 2 and n = 2.)

(b) Obtain the KKT conditions for this problem. 

(c) Use the KKT conditions to derive an optimal solution. 


39. Consider the following linearly constrained programming problem.

Minimize \( f(\mathbf{x}) = x_1^3 + 4x_2^2 + 16x_3 \),
subject to \( x_1 + x_2 + x_3 = 5 \)
and \( x_1 \ge 1 \), \( x_2 \ge 1 \), \( x_3 \ge 1 \).

(a) Convert this problem to an equivalent nonlinear programming problem that fits the
form given at the beginning of the chapter (second paragraph), with m = 2 and
n = 3.

(b) Use the form obtained in part (a) to construct the KKT conditions for this problem.

(c) Use the KKT conditions to check whether \( (x_1, x_2, x_3) = (2, 1, 2) \) is optimal.

(d) Starting from the initial trial solution \( (x_1, x_2, x_3) = (1, 2, 2) \), apply two iterations
of the Frank-Wolfe algorithm.


40. Consider the following linearly constrained convex programming problem.

Minimize \( Z = x_1^2 - 6x_1 + x_2^3 - 3x_2 \),
subject to \( x_1 + x_2 \le 1 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

(a) Obtain the KKT conditions for this problem.

(b) Use the KKT conditions to check whether \( (x_1, x_2) = (2, 4) \) is an optimal solution.

(c) Use the KKT conditions to derive an optimal solution.

(d) Starting from the initial trial solution \( (x_1, x_2) = (0, 0) \), use one iteration of the
Frank-Wolfe algorithm to obtain exactly the same solution you found in part (c), 
and then use a second iteration to verify that it is an optimal solution (because it is 
replicated exactly). Explain why exactly the same results would be obtained on these 
two iterations with any other trial solution as well. 


41. Consider the following linearly constrained convex programming problem.

Maximize \( f(\mathbf{x}) = 8x_1 - x_1^2 + 2x_2 + x_3 \),
subject to \( x_1 + 3x_2 + 2x_3 \le 12 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \), \( x_3 \ge 0 \).

(a) Use the KKT conditions to demonstrate that \( (x_1, x_2, x_3) = (2, 2, 2) \) is not an optimal
solution.

(b) Use the KKT conditions to derive an optimal solution. (Hint: Do some preliminary
intuitive analysis to determine the most promising case regarding which variables
are nonzero and which are zero.)

(c) Starting from the initial trial solution \( (x_1, x_2, x_3) = (0, 0, 0) \), apply three iterations
of the Frank-Wolfe algorithm. 


42. Use the KKT conditions to determine whether \( (x_1, x_2, x_3) = (1, 1, 1) \) can be optimal
for the following problem.

Minimize \( Z = 2x_1 + x_2^3 + x_3^2 \),
subject to \( x_1^2 + 2x_2^2 + x_3^2 \ge 4 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \), \( x_3 \ge 0 \).


43. Consider the quadratic programming example presented in Sec. 14.7. 

(a) Use the test given in Appendix 1 to show that the objective function is strictly 
concave. 

(b) Verify that the objective function is strictly concave by demonstrating that Q is a
positive definite matrix; that is, \( \mathbf{x}^T \mathbf{Q} \mathbf{x} > 0 \) for all \( \mathbf{x} \ne \mathbf{0} \). (Hint: Reduce \( \mathbf{x}^T \mathbf{Q} \mathbf{x} \) to a
sum of squares.)

(c) Show that \( x_1 = 12 \), \( x_2 = 9 \), and \( u_1 = 3 \) satisfy the KKT conditions when they are
written in the form given in Sec. 14.6.

(d) Starting from the initial trial solution \( (x_1, x_2) = (5, 5) \), apply three iterations of the
Frank-Wolfe algorithm. 


44.* Consider the following quadratic programming problem.

Maximize \( f(\mathbf{x}) = 8x_1 - x_1^2 + 4x_2 - x_2^2 \),
subject to \( x_1 + x_2 \le 2 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).


(a) Use the KKT conditions to derive an optimal solution. 

(b) Now suppose that this problem is to be solved by the modified simplex method. 
Formulate the linear programming problem that is to be addressed explicitly, and 
then identify the additional complementarity constraint that is enforced automatically 
by the algorithm. 

(c) Apply the modified simplex method to the problem as formulated in part (b). 


45. Consider the following quadratic programming problem.

Maximize \( f(\mathbf{x}) = 20x_1 - 20x_1^2 + 50x_2 - 5x_2^2 + 20x_1 x_2 \),
subject to \( x_1 + x_2 \le 6 \)
\( x_1 + 4x_2 \le 18 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).


Suppose that this problem is to be solved by the modified simplex method. 

(a) Formulate the linear programming problem that is to be addressed explicitly, and 
then identify the additional complementarity constraint that is enforced automatically 
by the algorithm. 

(b) Apply the modified simplex method to the problem as formulated in part (a). 


46. Consider the following quadratic programming problem.

Maximize \( f(\mathbf{x}) = 2x_1 + 3x_2 - x_1^2 - x_2^2 \),
subject to \( x_1 + x_2 \le 2 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

(a) Starting from the initial trial solution \( (x_1, x_2) = (0, 0) \), use the Frank-Wolfe
algorithm (six iterations) to solve the problem (approximately).

(b) Show graphically how the sequence of trial solutions obtained in part (a) can be 
extrapolated to obtain a closer approximation of an optimal solution. What is your 
resulting estimate of this solution? 

(c) Use the KKT conditions to derive an optimal solution directly. 

(d) Now suppose that this problem is to be solved by the modified simplex method. 
Formulate the linear programming problem that is to be addressed explicitly, and 
then identify the additional complementarity constraint that is enforced automatically 
by the algorithm. 

(e) Without applying the modified simplex method, show that the solution derived in 
part (c) is indeed optimal (Z = 0) for the equivalent problem formulated in part 
(d). 

(f) Apply the modified simplex method to the problem as formulated in part (d). 


47. Repeat parts (a) (three iterations only), (c), (d), and (e) of Prob. 46 for the first 
quadratic programming variation of the Wyndor Glass Co. problem presented in Sec. 14.2 (see 
Fig. 14.6). That is: 


Maximize \( Z = 126x_1 - 9x_1^2 + 182x_2 - 13x_2^2 \),
subject to the linear constraints given in Sec. 3.1.


48. Consider the following quadratic programming problem.

Minimize \( f(\mathbf{x}) = (x_1 - 1)^2 + (x_2 - 2)^2 - 3(x_1 + x_2) \),
subject to \( 4x_1 + x_2 \le 20 \)
\( x_1 + 4x_2 \le 20 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).


(a) Obtain the KKT conditions for this problem in the form given in Sec. 14.6. (Hint: 
These conditions assume that the objective function is to be maximized.) 

(b) You are given the information that the optimal solution does not lie on the boundary 
of the feasible region. Use this information to derive the optimal solution from the 
KKT conditions. 

(c) Now suppose that this problem is to be solved by the modified simplex method. 
Formulate the linear programming problem that is to be addressed explicitly, and 
then identify the additional complementarity constraint that is enforced automatically 
by the algorithm. 

(d) Apply the modified simplex method to the problem as formulated in part (c). 

(e) Use the separable programming formulation presented in Sec. 14.8 to formulate an
approximate linear programming model for this problem. Use \( x_1, x_2 = 0, 2.5, 5 \) as
the breakpoints of the piecewise linear functions.

(f) Use the simplex method (the automatic routine in your OR COURSEWARE) to 
solve the model formulated in part (e). Then reexpress this solution in terms of the 
original variables of the problem. 


49. A certain corporation is planning to produce and market three different products.
Let \( x_1 \), \( x_2 \), and \( x_3 \) denote the number of units of the three respective products to be produced.
The preliminary estimates of their potential profitability are as follows.

For the first 15 units produced of product 1, the unit profit would be approximately $36.
The unit profit would only be $3 for any additional units of product 1. For the first 20 units 
produced of product 2, the unit profit is estimated at $24. The unit profit would be $12 for 
each of the next 20 units and $9 for any additional units. For the first 10 units of product 3, 
the unit profit would be $45. The unit profit would be $30 for each of the next 5 units and $18 
for any additional units. 
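
The profit estimates above describe one piecewise linear profit function per product. As a rough illustration only, the sketch below evaluates those functions in Python. The helper names (`piecewise_profit`, `total_profit`) and the segment lists are assumptions introduced here, not notation from the text, and the resource constraints stated next in the problem are not checked.

```python
def piecewise_profit(units, segments):
    """Total profit when successive blocks of units earn different unit profits.

    segments is a list of (block_size, unit_profit) pairs; a block_size of
    None means "all remaining units".
    """
    total = 0
    remaining = units
    for block_size, unit_profit in segments:
        if remaining <= 0:
            break
        in_block = remaining if block_size is None else min(remaining, block_size)
        total += in_block * unit_profit
        remaining -= in_block
    return total


# Unit profits copied from the problem statement (hypothetical variable names).
PRODUCT_1 = [(15, 36), (None, 3)]
PRODUCT_2 = [(20, 24), (20, 12), (None, 9)]
PRODUCT_3 = [(10, 45), (5, 30), (None, 18)]


def total_profit(x1, x2, x3):
    """Sum of the three separable profit functions (constraints not checked)."""
    return (piecewise_profit(x1, PRODUCT_1)
            + piecewise_profit(x2, PRODUCT_2)
            + piecewise_profit(x3, PRODUCT_3))
```

Because each unit profit decreases from one block to the next, every one of these functions is concave, which is the situation separable programming is designed for.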

Certain limitations on the use of needed resources impose the following constraints on 
the production of the three products: 


\( x_1 + x_2 + x_3 \le 60 \)
\( 3x_1 + 2x_2 \le 200 \)
\( x_1 + 2x_3 \le 70 \).


Management wants to know what values of \( x_1 \), \( x_2 \), and \( x_3 \) should be chosen to maximize total
profit.
(a) Use the separable programming technique presented in Sec. 14.8 to formulate a
linear programming model for this problem.
(b) Now suppose that there is an additional constraint that the profit from products 1 
and 2 must total at least $900. Use the technique presented in the Extensions sub- 
section of Sec. 14.8 to add this constraint to the model formulated in part (a). 


50.* Consider the following convex programming problem.

Maximize \( f(\mathbf{x}) = 4x_1 + 6x_2 - x_1^3 - 2x_2^2 \),
subject to \( x_1 + 3x_2 \le 8 \)
\( 5x_1 + 2x_2 \le 14 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

(a) Verify that \( (x_1, x_2) = (2/\sqrt{3}, 3/2) \) is an optimal solution by applying the KKT
conditions.

(b) Use the separable programming technique presented in Sec. 14.8 to formulate an
approximate linear programming model for this problem. Use \( x_1, x_2 = 0, 1.5, 3 \) as
the breakpoints of the piecewise linear functions.

(c) Use the simplex method (the automatic routine in your OR COURSEWARE) to
solve the approximate model formulated in part (b). Verify that the optimal solution 
satisfies the special restriction for the model. Compare this solution with the exact 
optimal solution for the original problem [see part (a)]. 


51. Reconsider the production scheduling problem of the Build-Em-Fast Company
described in Prob. 21 of Chap. 7. The special restriction for such a situation is that overtime
should not be used in any particular period unless regular time in that period is completely used 
up. Explain why the logic of separable programming implies that this restriction will be satisfied 
automatically by any optimal solution for the transportation problem formulation of the problem. 


52. Consider the following linearly constrained convex programming problem.

Maximize \( f(\mathbf{x}) = 32x_1 + 50x_2 - 10x_2^2 + x_2^3 - x_1^4 - x_2^4 \),
subject to \( 3x_1 + x_2 \le 11 \)
\( 2x_1 + 5x_2 \le 16 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

(a) Use the separable programming technique presented in Sec. 14.8 to formulate an
approximate linear programming model for this problem. Use \( x_1 = 0, 1, 2, 3 \) and
\( x_2 = 0, 1, 2, 3 \) as the breakpoints of the piecewise linear functions.

(b) Use the KKT conditions to determine whether \( (x_1, x_2) = (2, 2) \) can be optimal for
this original problem (not the approximate model).

(c) Starting from the initial trial solution \( (x_1, x_2) = (0, 0) \), use the Frank-Wolfe
algorithm (four iterations) to solve the original problem (approximately).

(d) Ignore the constraints and solve the resulting two one-variable unconstrained
optimization problems. Use calculus to solve the problem involving \( x_1 \) and use the
one-dimensional search procedure with \( \epsilon = 0.1 \) and initial bounds 0 and 4 to solve the
problem involving \( x_2 \). Show that the resulting solution for \( (x_1, x_2) \) satisfies all of the
constraints, so it is actually optimal for the original problem.


53. Suppose that the separable programming technique has been applied to a certain
problem (the "original problem") to convert it into the following equivalent linear programming
problem.

Maximize \( Z = 5x_{11} + 4x_{12} + 2x_{13} + 4x_{21} + x_{22} \),
subject to \( 3x_{11} + 3x_{12} + 3x_{13} + 2x_{21} + 2x_{22} \le 25 \)
\( 2x_{11} + 2x_{12} + 2x_{13} - x_{21} - x_{22} \le 10 \)
and \( 0 \le x_{11} \le 2 \)
\( 0 \le x_{12} \le 3 \)
\( 0 \le x_{13} \)
\( 0 \le x_{21} \le 3 \)
\( 0 \le x_{22} \le 1 \).


What was the mathematical model for the original problem? (You may define the objec- 
tive function either algebraically or graphically, but express the constraints algebraically.) 


54. For each of the following cases, prove that the key property of separable program- 
ming given in Sec. 14.8 must hold. (Hint: Assume that there exists an optimal solution that 
violates this property, and then contradict this assumption by showing that there exists a better 
feasible solution.) 

(a) The special case of separable programming where all of the \( g_i(\mathbf{x}) \) are linear functions.

(b) The general case of separable programming where all of the functions are nonlinear
functions of the designated form. [Hint: Think of the functional constraints as
constraints on resources, where \( g_{ij}(x_j) \) represents the amount of resource i used by
running activity j at level \( x_j \), and then use what the convexity assumption implies
about the slopes of the approximating piecewise linear function.]


55. The MFG Company produces a certain subassembly in each of two separate plants. 
These subassemblies are then brought to a third nearby plant where they are used in the 
production of a certain product. The peak season of demand for this product is approaching, 
so in order to maintain the production rate within a desired range, it is necessary to use 
temporarily some overtime in making the subassemblies. The cost per subassembly on regular 
time (RT) and on overtime (OT) is shown in the following table for both plants, along with 
the maximum number of subassemblies that can be produced on RT and on OT each day. 

             Unit Cost          Capacity
             RT      OT         RT       OT
Plant 1      $15     $25        2,000    1,000
Plant 2      $16     $24        1,000      500

Let \( x_1 \) and \( x_2 \) denote the total number of subassemblies produced per day at plants 1 and
2, respectively. Suppose that the objective is to maximize \( Z = x_1 + x_2 \) subject to the constraint
that the total daily cost should not exceed $60,000. Note that the mathematical programming
formulation of this problem (with \( x_1 \) and \( x_2 \) as decision variables) has the same form as the
main case of the separable programming model described in Sec. 14.8, except that the separable 
functions appear in a constraint function rather than the objective function. However, if it is 
allowable to use OT even when the RT capacity at that plant is not fully used, the same approach 
can be used to reformulate the problem as a linear programming problem. 
(a) Formulate this linear programming problem. 
(b) Explain why the logic of separable programming also applies here to guarantee that 
an optimal solution for the model formulated in part (a) never uses OT unless the 
RT capacity at that plant has been fully used. 


56. Consider the following nonlinear programming problem (first considered in Chap. 
11, Prob. 19). 


Maximize \( Z = 5x_1 + x_2 \),
subject to \( 2x_1^2 + x_2 \le 13 \)
\( x_1^2 + x_2 \le 9 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

(a) Show that this problem is a convex programming problem.

(b) Use the separable programming technique discussed at the end of Sec. 14.8 to
formulate an approximate linear programming model for this problem. Use the
integers as the breakpoints of the piecewise linear function.

(c) Use the simplex method (the automatic routine in your OR COURSEWARE) to
solve the model formulated in part (b). Then reexpress this solution in terms of the
original variables of the problem.


57. Consider the following linearly constrained convex programming problem.

Maximize \( f(\mathbf{x}) = 3x_1 x_2 + 40x_1 + 30x_2 - 4x_1^2 - x_1^4 - 3x_2^2 - x_2^4 \),
subject to \( 4x_1 + 3x_2 \le 12 \)
\( x_1 + 2x_2 \le 4 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

Starting from the initial trial solution \( (x_1, x_2) = (0, 0) \), apply two iterations of the Frank-Wolfe
algorithm.


58.* Consider the following linearly constrained convex programming problem.

Maximize \( f(\mathbf{x}) = 3x_1 + 4x_2 - x_1^3 - x_2^2 \),
subject to \( x_1 + x_2 \le 1 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

(a) Starting from the initial trial solution \( (x_1, x_2) = (\tfrac{1}{4}, \tfrac{1}{4}) \), apply three iterations of the
Frank-Wolfe algorithm.

(b) Use the KKT conditions to check whether the solution obtained in part (a) is, in
fact, optimal.

(c) Starting from the initial trial solution \( (x_1, x_2) = (\tfrac{1}{4}, \tfrac{1}{4}) \), apply SUMT. Use the
gradient search procedure to obtain the maximizing solution of \( P(\mathbf{x}; r) \) at each iteration,
with \( r = 1, 10^{-2}, 10^{-4} \).


59. Consider the following linearly constrained convex programming problem.

Maximize \( f(\mathbf{x}) = 4x_1 - x_1^4 + 2x_2 - x_2^2 \),
subject to \( 4x_1 + 2x_2 \le 5 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

(a) Starting from the initial trial solution \( (x_1, x_2) = (\tfrac{1}{2}, \tfrac{1}{2}) \), apply four iterations of the
Frank-Wolfe algorithm.

(b) Show graphically how the sequence of trial solutions obtained in part (a) can be
extrapolated to obtain a closer approximation of an optimal solution. What is your
resulting estimate of this solution?

(c) Use the KKT conditions to check whether the solution you obtained in part (b) is,
in fact, optimal. If not, use these conditions to derive the exact optimal solution.

(d) Starting from the initial trial solution \( (x_1, x_2) = (\tfrac{1}{2}, \tfrac{1}{2}) \), apply SUMT. Use the
gradient search procedure to obtain the maximizing solution of \( P(\mathbf{x}; r) \) at each iteration,
with \( r = 1, 10^{-2}, 10^{-4}, 10^{-6} \).


60. Consider the example for applying SUMT given in Sec. 14.10. 

(a) Show that (x,, x.) = (1, 2) satisfies the KKT conditions. 

(b) Display the feasible region graphically, and then plot the locus of points \( x_1 x_2 = 2 \)
to demonstrate that \( (x_1, x_2) = (1, 2) \) with \( f(1, 2) = 2 \) is, in fact, a global maximum.


61.* Use SUMT to solve the following convex programming problem.

Maximize \( f(\mathbf{x}) = -2x_1 - (x_2 - 3)^2 \),
subject to \( x_1 \ge 3 \)
\( x_2 \ge 3 \).

Derive the maximizing solution of \( P(\mathbf{x}; r) \) analytically, and use \( r = 1, 10^{-2}, 10^{-4}, 10^{-6} \).


62. Use SUMT to solve the following convex programming problem.

Minimize \( f(\mathbf{x}) = \dfrac{(x_1 + 1)^3}{3} + x_2 \),
subject to \( x_1 \ge 1 \)
\( x_2 \ge 0 \).

Derive the minimizing solution of \( P(\mathbf{x}; r) \) analytically, and use \( r = 1, 10^{-2}, 10^{-4}, 10^{-6} \).


63. Use SUMT to solve the following convex programming problem.

Maximize \( f(\mathbf{x}) = x_1 x_2 - x_1 - x_1^2 - x_2 - x_2^2 \),
subject to \( x_2 \ge 0 \).

Use the gradient search procedure to obtain the maximizing solution of \( P(\mathbf{x}; r) \) at each iteration,
with \( r = 1, 10^{-2}, 10^{-4} \). Begin with the initial trial solution \( (x_1, x_2) = (1, 1) \).


64. Apply SUMT to Prob. 46. Use the gradient search procedure to obtain the
maximizing solution of \( P(\mathbf{x}; r) \) at each iteration, with \( r = 1, 10^{-2}, 10^{-4} \). Begin with the initial
trial solution \( (x_1, x_2) = (4, 4) \).


65. Apply SUMT to Prob. 47. Use the gradient search procedure to obtain the
maximizing solution of \( P(\mathbf{x}; r) \) at each iteration, with \( r = 10^{2}, 1, 10^{-2}, 10^{-4} \). Begin with the initial
trial solution \( (x_1, x_2) = (2, 3) \).


66. Consider the following nonconvex programming problem.

Maximize \( f(x) = 1{,}000x - 400x^2 + 40x^3 - x^4 \),
subject to \( x^2 + x \le 500 \)
and \( x \ge 0 \).

(a) Identify the feasible values for x. Obtain general expressions for the first three
derivatives of f(x). Use this information to help you draw a rough sketch of f(x)
over the feasible region for x. Without calculating their values, mark the points on
your graph that correspond to local maxima and local minima.

(b) Use the one-dimensional search procedure with \( \epsilon = 0.05 \) to find each of the local
maxima. Use your sketch from part (a) to identify appropriate initial bounds for
each of these searches. Which of the local maxima is a global maximum?

(c) Use SUMT with r = 10*, 107, 1 (and with \( \epsilon = 25 \) for the gradient search procedure)
to find each of the local maxima. Use x = 3 and x = 15 as the initial trial solutions
for these searches. To find the maximizing solution of \( P(x; r) \) each time, use the
one-dimensional search procedure as described in part (b). Which of the local
maxima is a global maximum?


67. Consider the following nonconvex programming problem.

subject to \( x_1^2 + 2x_2^2 \le 4 \)
\( 2x_1 - x_2 \le 3 \)
\( x_1 x_2^2 + x_1^2 x_2 = 2 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

(a) If SUMT were to be applied to this problem, what would be the unconstrained
function \( P(\mathbf{x}; r) \) to be maximized at each iteration?

(b) Starting from the initial trial solution \( (x_1, x_2) = (1, 1) \), apply SUMT. Use the
gradient search procedure to obtain the maximizing solution of \( P(\mathbf{x}; r) \) at each
iteration, with \( r = 1, 10^{-2}, 10^{-4} \).


68. Reconsider the convex programming problem with an equality constraint given in
Prob. 39.

(a) If SUMT were to be applied to this problem, what would be the unconstrained
function \( P(\mathbf{x}; r) \) to be minimized at each iteration?

(b) Starting from the initial trial solution \( (x_1, x_2, x_3) = (\tfrac{3}{2}, \tfrac{3}{2}, 2) \), apply SUMT with
\( r = 10^{-2}, 10^{-4}, 10^{-6}, 10^{-8} \).


69. Consider the following nonconvex programming problem.

Minimize \( f(\mathbf{x}) = \sin 3x_1 + \cos 3x_2 + \sin(x_1 + x_2) \),
subject to \( x_1^2 - 10x_2 \ge -1 \)
\( 10x_1 + x_2^2 \le 100 \)
and \( x_1 \ge 0 \), \( x_2 \ge 0 \).

(a) If SUMT were to be applied to this problem, what would be the unconstrained
function \( P(\mathbf{x}; r) \) to be minimized at each iteration?

(b) Describe how SUMT should be applied to attempt to obtain a global minimum. (Do
not actually solve.)
15.1 Introduction


In decision-making problems, we often are faced with making decisions based upon 
phenomena that have uncertainty associated with them. This uncertainty arises from
inherent variation, either from sources that elude control or from the inconsistency
of natural phenomena. Rather than treat this variability qualitatively, we can
incorporate it into the mathematical model and thus handle it quantitatively. This 
treatment generally can be accomplished if the natural phenomena exhibit some degree 
of regularity, so that their variation can be described by a probability model. We 
assume that the reader has a basic knowledge of probability theory. The ensuing 
sections are concerned with special types of probability models. 
15.2 Stochastic Processes


A stochastic process is defined to be simply an indexed collection of random variables
\( \{X_t\} \), where the index t runs through a given set T. Often T is taken to be the set of
nonnegative integers, and \( X_t \) represents a measurable characteristic of interest at
time t. For example, the stochastic process \( X_1, X_2, X_3, \ldots \) can represent the
collection of weekly (or monthly) inventory levels of a given product, or it can
represent the collection of weekly (or monthly) demands for this product.

There are many stochastic processes that are of interest. A consideration of the 
behavior of a system operating for some period of time often leads to the analysis of 
a stochastic process with the following structure. At particular points of time t labeled 
0,1,... , the system is found in exactly one of a finite number of mutually exclusive 
and exhaustive categories or states labeled 0, 1, . . . , M. The points in time may be 
spaced equally, or their spacing may depend upon the overall behavior of the physical 
system in which the stochastic process is imbedded, e.g., the time between occurrences 
of some phenomenon of interest. Although the states may constitute a qualitative as 
well as a quantitative characterization of the system, no loss of generality is entailed 
by the numerical labels, 0, 1, ..., M, which are used henceforth to denote the 
possible states of the system. Thus the mathematical representation of the physical 
system is that of a stochastic process \( \{X_t\} \), where the random variables are observed
at t = 0, 1, 2, ..., and where each random variable may take on any one of
the (M + 1) integers 0, 1, ..., M. These integers are a characterization of the
(M + 1) states of the process. It must be emphasized that each state that the stochastic 
process reaches is given a label that denotes the physical state of the system. It is 
only for notational convenience that this set is labeled 0, 1,..., M. 

As an example, consider the following inventory problem. A camera store stocks
a particular model camera that can be ordered weekly. Let \( D_1, D_2, \ldots \) represent
the demand for this camera during the first week, second week, ..., respectively.
It is assumed that the \( D_t \) are independent and identically distributed random variables
having a known probability distribution. Let \( X_0 \) represent the number of cameras on
hand at the outset, \( X_1 \) the number of cameras on hand at the end of week one, \( X_2 \) the
number of cameras on hand at the end of week two, and so on. Assume that \( X_0 = 3 \).
On Saturday night the store places an order that is delivered in time for the opening
of the store on Monday. The store uses the following (s, S) ordering policy¹: If the
number of cameras on hand at the end of the week is less than s = 1 (no cameras in
stock), the store orders (up to) S = 3. Otherwise, the store does not order (if there
are any cameras in stock, no order is placed). It is assumed that sales are lost when
demand exceeds the inventory on hand. Thus \( \{X_t\} \) for t = 0, 1, ... is a stochastic
process of the form just described. The possible states of the process are the integers
0, 1, 2, 3 representing the possible number of cameras on hand at the end of the
week. In fact, the random variables \( X_t \) are clearly dependent and may be evaluated
iteratively by the expression

\[
X_{t+1} =
\begin{cases}
\max\{(3 - D_{t+1}),\, 0\}, & \text{if } X_t < 1 \\
\max\{(X_t - D_{t+1}),\, 0\}, & \text{if } X_t \ge 1,
\end{cases}
\]

for t = 0, 1, 2, .... This example is used for illustrative purposes throughout many
of the following sections. Section 15.3 further defines the type of stochastic process
considered in this chapter.

¹ In general, an (s, S) policy is a periodic review policy that calls for ordering up to S units whenever the
inventory level dips below s (\( S \ge s \)). If the inventory level is s or greater, then no order is placed. These
policies are discussed in detail in Chap. 18.
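
To make the recursion for the camera example concrete, the following minimal Python sketch traces the end-of-week inventory levels for a given list of weekly demands. The function names and the demand values in the usage note are illustrative assumptions; the text itself only specifies s = 1, S = 3, and a starting inventory of 3.

```python
def next_inventory(x_t, demand, s=1, S=3):
    """One step of the recursion: order up to S when stock is below s."""
    # Stock available on Monday morning, after any order has been delivered.
    start_of_week = S if x_t < s else x_t
    # Unmet demand is lost, so inventory never goes negative.
    return max(start_of_week - demand, 0)


def simulate_inventory(demands, x0=3, s=1, S=3):
    """Return the list [X_0, X_1, ..., X_n] for the given weekly demands."""
    levels = [x0]
    for demand in demands:
        levels.append(next_inventory(levels[-1], demand, s, S))
    return levels
```

For instance, with the hypothetical demands `[1, 3, 0, 2, 4]`, `simulate_inventory` returns `[3, 2, 0, 3, 1, 0]`: the store runs out in week two, reorders up to 3 for week three, and runs out again in week five.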


15.3 Markov Chains 





Assumptions regarding the joint distribution of \( X_0, X_1, \ldots \) are necessary to obtain
analytical results. One assumption that leads to analytical tractability is that the
stochastic process is a Markov chain (defined later), which has the following key
property: A stochastic process \( \{X_t\} \) is said to have the Markovian property if

\[
P\{X_{t+1} = j \mid X_0 = k_0, X_1 = k_1, \ldots, X_{t-1} = k_{t-1}, X_t = i\} = P\{X_{t+1} = j \mid X_t = i\},
\]

for t = 0, 1, ... and every sequence \( i, j, k_0, k_1, \ldots, k_{t-1} \).

This Markovian property can be shown to be equivalent to stating that the 
conditional probability of any future "event," given any past "event" and the present
state \( X_t = i \), is independent of the past event and depends only upon the present state
of the process. The conditional probabilities \( P\{X_{t+1} = j \mid X_t = i\} \) are called transition
probabilities. If, for each i and j,

\[
P\{X_{t+1} = j \mid X_t = i\} = P\{X_1 = j \mid X_0 = i\}, \quad \text{for all } t = 0, 1, \ldots,
\]





then the (one-step) transition probabilities are said to be stationary and are usually
denoted by \( p_{ij} \). Thus having stationary transition probabilities implies that the transition
probabilities do not change in time. The existence of stationary (one-step) transition
probabilities also implies that, for each i, j, and n (n = 0, 1, 2, ...),

\[
P\{X_{t+n} = j \mid X_t = i\} = P\{X_n = j \mid X_0 = i\},
\]

for all t = 0, 1, .... These conditional probabilities are usually denoted by \( p_{ij}^{(n)} \) and
are called n-step transition probabilities.¹ Thus \( p_{ij}^{(n)} \) is just the conditional probability
that the random variable X, starting in state i, will be in state j after exactly n steps
(time units).

Because the \( p_{ij}^{(n)} \) are conditional probabilities, they must be nonnegative, and
since the process must make a transition into some state, they must satisfy the
properties





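\[
p_{ij}^{(n)} \ge 0, \quad \text{for all } i \text{ and } j, \text{ and } n = 0, 1, 2, \ldots,
\]

\[
\sum_{j=0}^{M} p_{ij}^{(n)} = 1, \quad \text{for all } i, \text{ and } n = 0, 1, 2, \ldots.
\]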


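A convenient notation for representing the n-step transition probabilities is the matrix form

\[
\mathbf{P}^{(n)} =
\begin{bmatrix}
p_{00}^{(n)} & \cdots & p_{0M}^{(n)} \\
\vdots & & \vdots \\
p_{M0}^{(n)} & \cdots & p_{MM}^{(n)}
\end{bmatrix},
\quad \text{for } n = 0, 1, 2, \ldots.
\]

¹ For n = 0, \( p_{ij}^{(0)} \) is just \( P\{X_0 = j \mid X_0 = i\} \) and hence is 1 when i = j and 0 when i ≠ j. For n = 1, \( p_{ij}^{(1)} \)
is just the (one-step) transition probability and is denoted by \( p_{ij} \).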


It is now possible to define a Markov chain. A stochastic process \( \{X_t\} \) (t = 0,
1, ...) is said to be a finite-state Markov chain¹ if it has the following properties:


1. A finite number of states.

2. The Markovian property.

3. Stationary transition probabilities.

4. A set of initial probabilities \( P\{X_0 = i\} \) for all i.
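
The four properties above are exactly what is needed to generate a sample path of the process. The sketch below is one possible illustration in Python, not a construction taken from the text: `simulate_chain`, its arguments, and the two-state matrix in the usage note are all hypothetical names and values chosen here.

```python
import random


def simulate_chain(P, initial, steps, rng=None):
    """Sample X_0, X_1, ..., X_steps from a finite-state Markov chain.

    P[i][j] is the stationary one-step transition probability from state i
    to state j, and initial[i] is the initial probability P{X_0 = i}.
    """
    rng = rng or random.Random()
    states = list(range(len(P)))  # property 1: a finite number of states
    # Property 4: draw the starting state from the initial probabilities.
    path = [rng.choices(states, weights=initial)[0]]
    for _ in range(steps):
        # Properties 2 and 3: the next state depends only on the current
        # state, through the same row of P at every time step.
        current = path[-1]
        path.append(rng.choices(states, weights=P[current])[0])
    return path
```

As a quick check, a chain with `P = [[0, 1], [1, 0]]` and `initial = [1, 0]` has no randomness left, so `simulate_chain(P, [1, 0], 4)` alternates between the two states starting from state 0.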


Returning to the inventory example developed in the preceding section, it is
easily seen that $\{X_t\}$, where $X_t$ is the number of cameras in stock at the end of the t-th
week (before an order is received), is a Markov chain. Now consider how to obtain
the (one-step) transition probabilities, i.e., the elements of the (one-step) transition
matrix

\[
\mathbf{P} =
\begin{bmatrix}
p_{00} & p_{01} & p_{02} & p_{03} \\
p_{10} & p_{11} & p_{12} & p_{13} \\
p_{20} & p_{21} & p_{22} & p_{23} \\
p_{30} & p_{31} & p_{32} & p_{33}
\end{bmatrix},
\]

assuming that each $D_t$ has a Poisson distribution with parameter $\lambda = 1$.

To obtain $p_{00}$ it is necessary to evaluate $P\{X_t = 0 \mid X_{t-1} = 0\}$. If $X_{t-1} = 0$,
then $X_t = \max\{(3 - D_t), 0\}$. Therefore, if $X_t = 0$, then the demand during the week
has to be 3 or more. Hence $p_{00} = P\{D_t \ge 3\}$. This transition probability is just the
probability that a Poisson random variable with parameter $\lambda = 1$ takes on a value of
3 or more, which is obtained from Table A5.4 of Appendix 5, so that $p_{00} = 0.080$.
$p_{10} = P\{X_t = 0 \mid X_{t-1} = 1\}$ can be obtained in a similar way. If $X_{t-1} = 1$, then
$X_t = \max\{(1 - D_t), 0\}$. To have $X_t = 0$, the demand during the week has to be 1 or
more. Hence $p_{10} = P\{D_t \ge 1\} = 0.632$ (from Table A5.4 of Appendix 5). To find
$p_{21} = P\{X_t = 1 \mid X_{t-1} = 2\}$, note that $X_t = \max\{(2 - D_t), 0\}$ if $X_{t-1} = 2$. Therefore,
if $X_t = 1$, then the demand during the week has to be exactly 1. Hence $p_{21} =
P\{D_t = 1\} = 0.368$ (from Table A5.4 of Appendix 5). The remaining entries are
obtained in a similar manner, which yields the following (one-step) transition matrix:

\[
\mathbf{P} =
\begin{bmatrix}
0.080 & 0.184 & 0.368 & 0.368 \\
0.632 & 0.368 & 0 & 0 \\
0.264 & 0.368 & 0.368 & 0 \\
0.080 & 0.184 & 0.368 & 0.368
\end{bmatrix}.
\]
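The entries of this matrix can also be generated directly from the Poisson distribution instead of being read from a table. The following is a minimal sketch under the assumptions of the example (Poisson demand with mean 1, reordering up to three cameras only when the week ends with none in stock); the function names `poisson_pmf` and `inventory_transition_matrix` are hypothetical and chosen only for illustration.

```python
import math

def poisson_pmf(k, lam=1.0):
    """P{D = k} for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def inventory_transition_matrix(order_up_to=3, lam=1.0):
    """One-step matrix for the camera example: stock is replenished up to
    `order_up_to` only when the week ends with zero cameras."""
    size = order_up_to + 1
    P = []
    for i in range(size):
        # Stock on hand at the start of the week, after any delivery.
        start = order_up_to if i == 0 else i
        row = [0.0] * size
        for j in range(1, start + 1):
            row[j] = poisson_pmf(start - j, lam)  # demand is exactly start - j
        row[0] = 1.0 - sum(row)                   # demand is start or more
        P.append(row)
    return P

P = inventory_transition_matrix()
for row in P:
    print(" ".join(f"{x:.3f}" for x in row))
```

Rounded to three decimals, the printed rows should agree with the matrix above; rows 0 and 3 coincide because both start the week with three cameras.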


¹ The definitions of Markovian property and Markov chain are more restrictive in this context than the
usages of these terms in the literature because the discussion is confined to a discrete time parameter and
finite-state space.

Several other examples of Markov chains follow: Consider the following model
for the value of a stock. At the end of a given day, the price is recorded. If the stock
has gone up, the probability that it will go up tomorrow is 0.7. If the stock goes
down, the probability that it will go up tomorrow is only 0.5. This is a Markov chain,
where state 0 represents the stock going up and state 1 represents the stock going
down. The transition matrix is given by

\[
\mathbf{P} =
\begin{bmatrix}
0.7 & 0.3 \\
0.5 & 0.5
\end{bmatrix}.
\]

Suppose now that the stock market model is changed so that whether or not the 
stock goes up tomorrow depends upon whether or not it increased today and yesterday. 
In particular, if the stock has increased for the past two days, it will increase tomorrow 
with probability 0.9. If the stock increased today but decreased yesterday, then it will 
increase tomorrow with probability 0.6. If the stock decreased today but increased 
yesterday, then it will increase tomorrow with probability 0.5. Finally, if the stock 
decreased for the past two days, then it will increase tomorrow with probability 0.3. 
If we define the state as representing whether the stock goes up or down, the system 
is no longer a Markov chain. However, we can transform the system into a Markov 
chain by defining the states as follows:¹


State 0: The stock increased both today and yesterday. 
State 1: The stock increased today but decreased yesterday. 
State 2: The stock decreased today but increased yesterday. 
State 3: The stock decreased both today and yesterday. 


This leads to a four-state Markov chain with the following transition matrix: 


\[
\mathbf{P} =
\begin{bmatrix}
0.9 & 0 & 0.1 & 0 \\
0.6 & 0 & 0.4 & 0 \\
0 & 0.5 & 0 & 0.5 \\
0 & 0.3 & 0 & 0.7
\end{bmatrix}.
\]


One row in the matrix will be verified, say, the second. This corresponds to state 1,
which represents the stock increasing today but decreasing yesterday. The first element
in the row is the probability of moving to state 0, i.e., of the stock increasing tomorrow
(having increased today), given that the stock increased today but decreased yesterday.
This is just the probability of the stock increasing tomorrow given that it increased today
but decreased yesterday, i.e., 0.6. Similarly, the third element in the row is the
probability of moving to state 2, i.e., of the stock decreasing tomorrow (having increased
today), given that the stock increased today but decreased yesterday. This is just the
probability of the stock decreasing tomorrow given that it increased today but decreased
yesterday, i.e., 0.4. The other two elements are 0 because they pertain to events that are
contradictory; namely, they correspond to instances where the stock decreased today.
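The same row-by-row reasoning can be carried out mechanically for all four states. The sketch below encodes each state as a pair (movement today, movement yesterday) and fills in the only two transitions that are not contradictory; the names `STATES`, `P_INCREASE`, and `build_transition_matrix` are hypothetical and used only for illustration.

```python
UP, DOWN = "up", "down"

# State index -> (movement today, movement yesterday), in the order listed above.
STATES = [(UP, UP), (UP, DOWN), (DOWN, UP), (DOWN, DOWN)]

# Probability that the stock increases tomorrow, given (today, yesterday).
P_INCREASE = {(UP, UP): 0.9, (UP, DOWN): 0.6, (DOWN, UP): 0.5, (DOWN, DOWN): 0.3}

def build_transition_matrix():
    n = len(STATES)
    P = [[0.0] * n for _ in range(n)]
    for i, (today, _yesterday) in enumerate(STATES):
        p_up = P_INCREASE[STATES[i]]
        # Tomorrow, today's movement becomes "yesterday" in the new state.
        P[i][STATES.index((UP, today))] = p_up
        P[i][STATES.index((DOWN, today))] = 1.0 - p_up
    return P

P = build_transition_matrix()
for row in P:
    print(" ".join(f"{x:.1f}" for x in row))
```

Every other entry stays 0 because the new state's "yesterday" must equal the old state's "today."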


¹ This example demonstrates that Markov chains are able to incorporate arbitrary amounts of history, but
at the cost of significantly increasing the number of states; i.e., to incorporate N periods of history requires
$2^N$ states to model such arbitrary dependency as a Markov chain.

Another example is gambling. Suppose that a player has $1, and with each play
of the game wins a dollar with probability p or loses a dollar with probability
1 - p. The game ends when the player either accumulates $3 or goes broke. This
model is a Markov chain with the states representing the player's fortune, i.e., 0, $1,
$2, or $3, and with the transition matrix given by

\[
\mathbf{P} =
\begin{bmatrix}
1 & 0 & 0 & 0 \\
1-p & 0 & p & 0 \\
0 & 1-p & 0 & p \\
0 & 0 & 0 & 1
\end{bmatrix}.
\]


Note that in both the inventory and gambling examples, the numeric labeling 
of the states that the process reaches coincides with the physical expression of the 
system—i.e., actual inventory levels and the player’s fortune, respectively — whereas 
the numeric labeling of the states in the stock examples represents notational conve- 
nience. 


15.4 Chapman-Kolmogorov Equations 


Section 15.3 introduced the n-step transition probability $p_{ij}^{(n)}$. This transition probability
can be useful when the process is in state i, and the probability that the process will
be in state j after n periods is desired. The Chapman-Kolmogorov equations provide
a method for computing these n-step transition probabilities:

\[
p_{ij}^{(n)} = \sum_{k=0}^{M} p_{ik}^{(v)} p_{kj}^{(n-v)}, \quad \text{for all } i, j, n, \text{ and } 0 \le v \le n.
\]

These equations merely point out that in going from state i to state j in n steps,
the process will be in some state k after exactly v (less than n) steps. Thus
$p_{ik}^{(v)} p_{kj}^{(n-v)}$ is just the conditional probability that, starting from state i, the process goes
to state k after v steps and then to state j in n - v steps. Therefore, summing these
conditional probabilities over all possible k must yield $p_{ij}^{(n)}$. The special cases of
v = 1 and v = n - 1 lead to the expressions

\[
p_{ij}^{(n)} = \sum_{k=0}^{M} p_{ik} p_{kj}^{(n-1)}
\quad \text{and} \quad
p_{ij}^{(n)} = \sum_{k=0}^{M} p_{ik}^{(n-1)} p_{kj},
\]

for all i, j, n. It then becomes evident that the n-step transition probabilities can be
obtained from the one-step transition probabilities recursively. This recursive
relationship is best explained in matrix notation (see Appendix 3). For n = 2, these
expressions become

\[
p_{ij}^{(2)} = \sum_{k=0}^{M} p_{ik} p_{kj}, \quad \text{for all } i, j.
\]


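Note that the $p_{ij}^{(2)}$ are the elements of the matrix $\mathbf{P}^{(2)}$. However, it must also be noted
that these elements,

\[
\sum_{k=0}^{M} p_{ik} p_{kj},
\]

are obtained by multiplying the matrix of one-step transition probabilities by itself;
that is,

\[
\mathbf{P}^{(2)} = \mathbf{P} \cdot \mathbf{P} = \mathbf{P}^2.
\]

More generally, it follows that the matrix of n-step transition probabilities can be
obtained from the expression

\[
\mathbf{P}^{(n)} = \mathbf{P} \cdot \mathbf{P} \cdots \mathbf{P} = \mathbf{P}^n
= \mathbf{P}\mathbf{P}^{n-1} = \mathbf{P}^{n-1}\mathbf{P}.
\]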


Thus the n-step transition probability matrix can be obtained by computing the nth 
power of the one-step transition matrix. For values of n that are not too large, the 
n-step transition matrix can be calculated in the manner just described. However, 
when n is large, such computations are often tedious, and furthermore, round-off 
errors may cause inaccuracies. 

Returning to the inventory example, the two-step transition matrix is given by¹

\[
\mathbf{P}^{(2)} = \mathbf{P}^2 =
\begin{bmatrix}
0.080 & 0.184 & 0.368 & 0.368 \\
0.632 & 0.368 & 0 & 0 \\
0.264 & 0.368 & 0.368 & 0 \\
0.080 & 0.184 & 0.368 & 0.368
\end{bmatrix}
\begin{bmatrix}
0.080 & 0.184 & 0.368 & 0.368 \\
0.632 & 0.368 & 0 & 0 \\
0.264 & 0.368 & 0.368 & 0 \\
0.080 & 0.184 & 0.368 & 0.368
\end{bmatrix}
=
\begin{bmatrix}
0.249 & 0.286 & 0.300 & 0.165 \\
0.283 & 0.252 & 0.233 & 0.233 \\
0.351 & 0.319 & 0.233 & 0.097 \\
0.249 & 0.286 & 0.300 & 0.165
\end{bmatrix}.
\]

Thus, given that there is one camera left in stock at the end of a week, the probability
is 0.283 that there will be no cameras in stock 2 weeks later; that is, $p_{10}^{(2)} = 0.283$.
Similarly, given that there are two cameras left in stock at the end of a week, the
probability is 0.097 that there will be three cameras in stock 2 weeks later; that is,
$p_{23}^{(2)} = 0.097$.

The four-step transition matrix can also be obtained as follows:

\[
\mathbf{P}^{(4)} = \mathbf{P}^4 = \mathbf{P}^{(2)} \cdot \mathbf{P}^{(2)} =
\begin{bmatrix}
0.249 & 0.286 & 0.300 & 0.165 \\
0.283 & 0.252 & 0.233 & 0.233 \\
0.351 & 0.319 & 0.233 & 0.097 \\
0.249 & 0.286 & 0.300 & 0.165
\end{bmatrix}
\begin{bmatrix}
0.249 & 0.286 & 0.300 & 0.165 \\
0.283 & 0.252 & 0.233 & 0.233 \\
0.351 & 0.319 & 0.233 & 0.097 \\
0.249 & 0.286 & 0.300 & 0.165
\end{bmatrix}
=
\begin{bmatrix}
0.289 & 0.286 & 0.261 & 0.164 \\
0.282 & 0.285 & 0.268 & 0.166 \\
0.284 & 0.283 & 0.263 & 0.171 \\
0.289 & 0.286 & 0.261 & 0.164
\end{bmatrix}.
\]

Thus, given that there is one camera left in stock at the end of a week, the probability
is 0.282 that there will be no cameras in stock 4 weeks later; that is, $p_{10}^{(4)} = 0.282$.
Similarly, given that there are two cameras left in stock at the end of a week, the
probability is 0.171 that there will be three cameras in stock 4 weeks later; that is,
$p_{23}^{(4)} = 0.171$.
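The matrix powers above can be checked with a few lines of code. The following is a sketch only: it uses plain Python lists rather than a numerical library, starts from the three-decimal one-step matrix, and the helper names `mat_mul` and `mat_pow` are hypothetical. Because intermediate results are not rounded here, individual entries may differ from the hand computation in the third decimal place.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    inner, cols = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for row in A]

def mat_pow(P, n):
    """n-step transition matrix P^(n) = P^n, by repeated squaring (n >= 0)."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    base = [row[:] for row in P]
    while n > 0:
        if n % 2 == 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n //= 2
    return result

# One-step matrix of the inventory example, as tabulated in the text.
P = [[0.080, 0.184, 0.368, 0.368],
     [0.632, 0.368, 0.000, 0.000],
     [0.264, 0.368, 0.368, 0.000],
     [0.080, 0.184, 0.368, 0.368]]

P2 = mat_pow(P, 2)
P4 = mat_pow(P, 4)
print(round(P2[1][0], 3), round(P2[2][3], 3))
```

The Chapman-Kolmogorov equations guarantee that any split of the n steps gives the same result, so `mat_mul(mat_pow(P, 1), mat_pow(P, 3))` should reproduce `P4` up to floating-point error.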


¹ Note that round-off errors already appear in the row corresponding to state 1.
It was pointed out that the one- or n-step transition probabilities are conditional
probabilities; for example, $P\{X_n = j \mid X_0 = i\} = p_{ij}^{(n)}$. If the unconditional probability
$P\{X_n = j\}$ is desired, it is necessary to have specified the probability distribution of
the initial state. Denote this probability distribution by $Q_{X_0}(i)$, where

\[
Q_{X_0}(i) = P\{X_0 = i\}, \quad \text{for } i = 0, 1, \ldots, M.
\]

It then follows that

\[
P\{X_n = j\} = Q_{X_0}(0)\, p_{0j}^{(n)} + Q_{X_0}(1)\, p_{1j}^{(n)} + \cdots + Q_{X_0}(M)\, p_{Mj}^{(n)}.
\]

In the inventory example it was assumed that initially there were 3 units in
stock; that is, $X_0 = 3$. Thus

\[
Q_{X_0}(0) = Q_{X_0}(1) = Q_{X_0}(2) = 0 \quad \text{and} \quad Q_{X_0}(3) = 1.
\]

Hence the (unconditional) probability that there will be three cameras in stock 2 weeks
after the inventory system began is 0.165; that is, $P\{X_2 = 3\} = (1)\, p_{33}^{(2)}$. If, instead,
it were given that $Q_{X_0}(i) = \tfrac{1}{4}$, for i = 0, 1, 2, 3, then

\[
P\{X_2 = 3\} = \tfrac{1}{4}(0.165) + \tfrac{1}{4}(0.233) + \tfrac{1}{4}(0.097) + \tfrac{1}{4}(0.165) = 0.165.
\]


The fact that the same answer is obtained using these two initial probability distri- 
butions is purely coincidental. 
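The unconditional distribution can be computed without forming the n-step matrix explicitly, by multiplying the initial distribution (as a row vector) by the one-step matrix n times. The following sketch assumes the tabulated inventory matrix; the function name `distribution_after` is hypothetical.

```python
def distribution_after(q0, P, n):
    """Distribution of X_n given the initial distribution q0 of X_0."""
    q = list(q0)
    for _ in range(n):
        # One step: new q[j] = sum over i of q[i] * p_ij.
        q = [sum(q[i] * P[i][j] for i in range(len(q))) for j in range(len(P[0]))]
    return q

P = [[0.080, 0.184, 0.368, 0.368],
     [0.632, 0.368, 0.000, 0.000],
     [0.264, 0.368, 0.368, 0.000],
     [0.080, 0.184, 0.368, 0.368]]

start_with_three = distribution_after([0, 0, 0, 1], P, 2)
uniform_start = distribution_after([0.25, 0.25, 0.25, 0.25], P, 2)
print(round(start_with_three[3], 3), round(uniform_start[3], 3))
```

Both calls should give a probability of about 0.165 for three cameras after 2 weeks, matching the two hand calculations above.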


15.5 Classification of States of a Markov Chain 


It is evident that the transition probabilities associated with the states play an important 
role in the study of Markov chains. In order to further describe the properties of 
Markov chains it is necessary to present some concepts and definitions concerning 
these states. 

State j is said to be accessible from state i if $p_{ij}^{(n)} > 0$ for some n ≥ 0. Recall
that $p_{ij}^{(n)}$ is just the conditional probability of being in state j after n steps, starting in
state i. It is easily shown then that state j is accessible from state i if, and only if, it
is possible for the system to enter state j starting from state i. In the inventory example,
$p_{ij}^{(2)} > 0$ for all i, j, so that every state is accessible from every other state. Obviously,
a sufficient condition for all states to be accessible is that there exists a value of n,
not dependent upon i and j, for which $p_{ij}^{(n)} > 0$ for all i and j. The inventory example
satisfies this condition with n = 2. In the gambling example, state
2 is not accessible from state 3. This can be deduced from the context of the game
(once the player reaches state 3, the player never leaves this state) or by noting that
the n-step transition matrix for the game, $\mathbf{P}^{(n)}$, is of the form

\[
\mathbf{P}^{(n)} =
\begin{bmatrix}
1 & 0 & 0 & 0 \\
* & * & * & * \\
* & * & * & * \\
0 & 0 & 0 & 1
\end{bmatrix}
\]


for all n, where the symbol * represents nonnegative numbers. Note, however, that 
state 3 is accessible from state 2. 


If state j is accessible from state i, and, in addition, state i is accessible from
state j, then states i and j are said to communicate. In the inventory example, all
states communicate. In the gambling example, states 2 and 3 do not. In general,
(1) any state communicates with itself [because $p_{ii}^{(0)} = P\{X_0 = i \mid X_0 = i\} = 1$];
(2) if state i communicates with state j, then state j communicates with state i; and
furthermore, (3) if state i communicates with state j, and state j communicates with
state k, then state i communicates with state k. Properties (1) and (2) follow from the
definition of states communicating, whereas property (3) follows from the
Chapman-Kolmogorov equations.

As a result of these three properties of communication, the state space may be 
partitioned into disjoint classes, with two communicating states said to belong to the 
same class. Thus the states of a Markov chain may consist of one or more disjoint 
classes (a class may consist of a single state). If there is only one class, i.e., all the 
states communicate, the Markov chain is said to be irreducible. In the inventory 
example, the Markov chain is irreducible. In the first stock example, the Markov chain 
is irreducible. The gambling example contains three classes. State 0 forms a class, 
state 3 forms a class, and states 1 and 2 form a class. 
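The partition into classes depends only on which one-step transition probabilities are positive, so it can be found by computing accessibility as a transitive closure. The sketch below uses Warshall's algorithm for that closure, which is a choice made here for illustration and not a method given in the text; the function names and the value p = 0.5 are likewise assumptions.

```python
def reachable(P):
    """reach[i][j] is True when state j is accessible from state i."""
    n = len(P)
    # n = 0 steps (i == j) or one step with positive probability.
    reach = [[i == j or P[i][j] > 0 for j in range(n)] for i in range(n)]
    for k in range(n):                 # Warshall's transitive closure
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    return reach

def communicating_classes(P):
    """Disjoint classes of mutually accessible states, in order of first member."""
    reach = reachable(P)
    classes, seen = [], set()
    for i in range(len(P)):
        if i not in seen:
            cls = [j for j in range(len(P)) if reach[i][j] and reach[j][i]]
            classes.append(cls)
            seen.update(cls)
    return classes

def gambling_matrix(p):
    """Transition matrix of the gambling example for win probability p."""
    q = 1.0 - p
    return [[1, 0, 0, 0], [q, 0, p, 0], [0, q, 0, p], [0, 0, 0, 1]]

print(communicating_classes(gambling_matrix(0.5)))  # prints [[0], [1, 2], [3]]
```

A chain is irreducible exactly when this function returns a single class containing every state.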

It is often useful to talk about whether or not a process, starting in state i, will
ever return to this state. Let $f_{ii}$ denote the probability that the process will ever return
to state i given that it starts in state i. State i is called a recurrent state if $f_{ii} = 1$
and transient if $f_{ii} < 1$. A special case of a recurrent state is an absorbing state. A
state i is said to be an absorbing state if the (one-step) transition probability $p_{ii}$ equals
1 (which results in a value of 1 appearing as an element of the diagonal of the
one-step transition matrix).

Determining whether or not a state is recurrent or transient by evaluating $f_{ii}$ is
not generally simple. Hence, it is not always evident whether a state should be
classified as recurrent or transient. For example, although all states in the inventory example
are recurrent (as shown later in this section), it is not simple to prove that $f_{ii}$ equals
1 for all i.

In the gambling example, state 0 and state 3 are absorbing states (each belonging
to a separate class), and hence are recurrent, so $f_{00}$ and $f_{33}$ must equal 1. However,
states 1 and 2 are transient, and it is not simple to show that $f_{11}$ and $f_{22}$ are less
than 1.

It is possible to determine some properties of $f_{ii}$ that may be useful in
determining its value. If a Markov process is in state i, and i is recurrent, the probability
is 1 that the process will return to that state. Since the process is a Markov chain,
this is equivalent to the process beginning once more from state i, and with probability
1, it will once again return to that state. Repeating this argument leads to the
conclusion that state i will be entered infinitely often. Hence, a recurrent state has the property
that the expected number of time periods that the process is in state i is infinite.

If a Markov process is in state i, and i is transient, then the probability of
reentering state i is $f_{ii}$ (less than 1) and the probability of not reentering state i is
$1 - f_{ii}$. It is simple to show that the expected number of time periods that the process
is in state i is finite and given by

\[
\frac{1}{1 - f_{ii}}.
\]

Therefore, it follows that state i is recurrent if, and only if, the expected number of
time periods that the process is in state i is infinite given that the process started in
state i.
In order to calculate the expected number of time periods that the process is in
state i given that $X_0 = i$, define

\[
B_n =
\begin{cases}
1, & \text{if } X_n = i, \\
0, & \text{if } X_n \ne i.
\end{cases}
\]

The quantity

\[
\sum_{n=1}^{\infty} \left( B_n \mid X_0 = i \right)
\]

represents the number of time periods that the process is in state i given that $X_0 = i$.
Therefore, its expectation is given by

\[
E\left( \sum_{n=1}^{\infty} B_n \,\Big|\, X_0 = i \right)
= \sum_{n=1}^{\infty} E(B_n \mid X_0 = i)
= \sum_{n=1}^{\infty} P\{X_n = i \mid X_0 = i\}
= \sum_{n=1}^{\infty} p_{ii}^{(n)}.
\]

Thus, it has been shown that a state i is recurrent if, and only if,

\[
\sum_{n=1}^{\infty} p_{ii}^{(n)} = \infty.
\]

This result can be used to show that recurrence is a class property. That is, all 
states in a class are either recurrent or transient. Furthermore, in a finite-state Markov 
chain, not all states can be transient. Otherwise, after a finite amount of time has 
elapsed, every one of the states will never be reentered (since each state is assumed 
to be transient), which is impossible since the process must be in some state after this 
elapsed time. Therefore, all states in an irreducible finite-state Markov chain are 
recurrent. Indeed, one can identify an irreducible finite-state Markov chain (and there- 
fore conclude that all states are recurrent) by showing that all states of the process 
communicate. It has already been pointed out that a sufficient condition for all states 
to be accessible (and therefore communicate with each other) is that there exists a 
value of n, not dependent upon i and j, for which $p_{ij}^{(n)} > 0$ for all i and j. Thus, all
states in the inventory example are recurrent, since $p_{ij}^{(2)}$ is positive for all i and j.
Similarly, the first stock example contains only recurrent states, since $p_{ij}$ is positive
for all i and j. By calculating $p_{ij}^{(2)}$ for all i and j in the second stock example, it follows
that all states are recurrent since $p_{ij}^{(2)} > 0$.

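As another example, suppose that a Markov process has the following transition
matrix:

\[
\mathbf{P} =
\begin{array}{c|ccccc}
\text{State} & 0 & 1 & 2 & 3 & 4 \\ \hline
0 & \tfrac{1}{4} & \tfrac{3}{4} & 0 & 0 & 0 \\
1 & \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 & 0 \\
2 & 0 & 0 & 1 & 0 & 0 \\
3 & 0 & 0 & \tfrac{1}{3} & \tfrac{2}{3} & 0 \\
4 & 1 & 0 & 0 & 0 & 0
\end{array}
\]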


It is evident that state 2 is an absorbing state (and hence a recurrent state) because
once the process enters state 2 (third row of the matrix), it will never leave. States 3
and 4 are transient states because once the process is in state 3, there is a positive
probability that it will never return. The probability is $\tfrac{1}{3}$ that the process will go from
state 3 to state 2 on the first step. Once the process is in state 2, it remains in state
2. Once a process leaves state 4, it can never return. States 0 and 1 are recurrent
states. As indicated earlier, to show that states 0 and 1 are recurrent it is sufficient to
show that $f_{00} = 1$ and $f_{11} = 1$, which is not a simple task. Hence, an alternative
method is desirable. Observe that the n-step transition matrix of the preceding example
always has the appearance

\[
\mathbf{P}^{(n)} =
\begin{bmatrix}
* & * & 0 & 0 & 0 \\
* & * & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & * & * & 0 \\
* & * & 0 & 0 & 0
\end{bmatrix},
\]


where the symbol * represents positive numbers. Hence it is intuitively evident that 
once the process is in state 0, it will return to state 0 (possibly passing through state 
1) after some number of steps. A similar argument holds for state 1. 
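For a finite-state chain this classification can be automated. The sketch below relies on a standard fact that is not derived in the text: in a finite chain a state is recurrent exactly when every state accessible from it can lead back to it (its class is closed). Only the pattern of zero and positive entries matters, so the particular fractions entered for the five-state matrix should be read as illustrative, and the function name `classify_states` is hypothetical.

```python
from fractions import Fraction as F

def classify_states(P):
    """Label each state of a finite chain 'recurrent' or 'transient'.

    Assumes the finite-chain criterion: state i is recurrent if and only if
    every state accessible from i has i accessible from it."""
    n = len(P)
    reach = [[i == j or P[i][j] > 0 for j in range(n)] for i in range(n)]
    for k in range(n):                 # transitive closure of accessibility
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    labels = []
    for i in range(n):
        closed = all(reach[j][i] for j in range(n) if reach[i][j])
        labels.append("recurrent" if closed else "transient")
    return labels

# Five-state example; the positive entries are illustrative values.
P = [[F(1, 4), F(3, 4), 0, 0, 0],
     [F(1, 2), F(1, 2), 0, 0, 0],
     [0, 0, 1, 0, 0],
     [0, 0, F(1, 3), F(2, 3), 0],
     [1, 0, 0, 0, 0]]

print(classify_states(P))
```

For this matrix the function should report states 0, 1, and 2 as recurrent and states 3 and 4 as transient, in agreement with the discussion above.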

Another property of Markov chains that is to be considered is the property of
periodicities. The period of state i is defined to be the integer t (t > 1), such that
$p_{ii}^{(n)} = 0$ for all values of n other than t, 2t, 3t, ..., and t is the largest integer with
this property. In the gambling example, starting in state 1, it is possible for the process
to enter state 1 only at times 2, 4, ..., in which case state 1 is said to have period
2. This is evident by noting that the player can break even (be neither winning nor
losing) only at times 2, 4, ..., or by calculating $p_{11}^{(n)}$ for all n and noting that
$p_{11}^{(n)} = 0$ for n odd.

If there are two consecutive numbers, s and (s + 1), such that the process can
be in state i at times s and (s + 1), the state is said to have period 1 and is called an
aperiodic state.

Just as recurrence is a class property, it can be shown that periodicity is a class
property. That is, if state i in a class has period t, then all states in that class have
period t. In the gambling example, state 2 also has period 2.
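The largest t with the property in the definition is the greatest common divisor of the step counts n at which a return to state i is possible, which suggests a simple numerical check. The sketch below inspects return times only up to a finite horizon; that horizon is a heuristic assumption (generous for small chains), and the function name `period` is hypothetical.

```python
from math import gcd

def period(P, i, max_steps=None):
    """gcd of all n <= max_steps with p_ii^(n) > 0 (0 if no return is seen)."""
    size = len(P)
    if max_steps is None:
        max_steps = 2 * size * size    # heuristic horizon, an assumption
    current = {i}                      # states reachable in exactly n steps
    g = 0
    for n in range(1, max_steps + 1):
        current = {j for k in current for j in range(size) if P[k][j] > 0}
        if i in current:
            g = gcd(g, n)
    return g

gambling = [[1, 0, 0, 0], [0.5, 0, 0.5, 0], [0, 0.5, 0, 0.5], [0, 0, 0, 1]]
print(period(gambling, 1), period(gambling, 2), period(gambling, 0))  # prints 2 2 1
```

A result of 1 indicates an aperiodic state; here the absorbing state 0 is aperiodic while states 1 and 2 share period 2, as expected for states in the same class.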

A final property of Markov chains pertains to further classifying recurrent states. 
A recurrent state i is said to be positive recurrent if, starting in state i, the expected 
time for the process to reenter state i is finite. Similarly, a recurrent state i is said to 
be null recurrent if, starting in state i, the expected time for the process to reenter 
state i is infinite. It can be shown that for a finite-state Markov chain all recurrent 
states are positive recurrent states. Positive recurrent states that are aperiodic are called 
ergodic states. 


15.6 First Passage Times 


Section 15.4 dealt with finding n-step transition probabilities [i.e., given that the 
process is in state i, determining the (conditional) probability that the process will be 
in state j after n periods]. It is often desirable to make probability statements about 
the number of transitions made by the process in going from state i to state j for the 
first time. This length of time is called the first passage time in going from state i to 
state j. When j = i, this first passage time is just the number of transitions until the 
process returns to the initial state i. In this case, the first passage time is called the
recurrence time for state i.

To illustrate these definitions, reconsider the inventory example developed in
the preceding sections. Recall that the initial inventory ($X_0$) contains three cameras.
Suppose that it turns out that there are two cameras at the end of the first week ($X_1$
takes on the value 2), one camera at the end of the second week ($X_2$ takes on the
value 1), no cameras in stock at the end of the third week ($X_3$ takes on the value 0),
three cameras at the end of the fourth week ($X_4$ takes on the value 3), and one camera
at the end of the fifth week ($X_5$ takes on the value 1). In this case, the first passage
time in going from state 3 to state 1 is 2 weeks, the first passage time in going from
state 3 to state 0 is 3 weeks, and the recurrence time of state 3 is 4 weeks.

In general, the first passage times are random variables and hence have proba- 
bility distributions associated with them. These probability distributions depend upon 
the transition probabilities of the process. In particular, let $f_{ij}^{(n)}$ denote the probability
that the first passage time from state i to j is equal to n. It can be shown that these
probabilities satisfy the following recursive relationships:

\[
f_{ij}^{(1)} = p_{ij}^{(1)} = p_{ij},
\]
\[
f_{ij}^{(2)} = p_{ij}^{(2)} - f_{ij}^{(1)} p_{jj},
\]
\[
\vdots
\]
\[
f_{ij}^{(n)} = p_{ij}^{(n)} - f_{ij}^{(1)} p_{jj}^{(n-1)} - f_{ij}^{(2)} p_{jj}^{(n-2)} - \cdots - f_{ij}^{(n-1)} p_{jj}.
\]

Thus the probability of a first passage time from state i to state j in n steps can be
computed recursively from the one-step transition probabilities. In the inventory
example, the probability distribution of the first passage time in going from state 3 to
state 0 is obtained as follows:

\[
f_{30}^{(1)} = 0.080,
\]
\[
f_{30}^{(2)} = (0.249) - (0.080)(0.080) = 0.243.
\]
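The recursion lends itself to direct computation once the n-step matrices are available. The following sketch assumes the tabulated inventory matrix and uses hypothetical helper names (`mat_mul`, `first_passage_probs`); it returns the first passage probabilities from state i to state j for n = 1, ..., n_max.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    inner, cols = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for row in A]

def first_passage_probs(P, i, j, n_max):
    """[f_ij^(1), ..., f_ij^(n_max)] from the recursion
    f^(n) = p_ij^(n) - sum_{k=1}^{n-1} f^(k) * p_jj^(n-k)."""
    powers = [None, P]                 # powers[n] holds P^(n) for n >= 1
    for _ in range(2, n_max + 1):
        powers.append(mat_mul(powers[-1], P))
    f = [None]                         # 1-based: f[n] = f_ij^(n)
    for n in range(1, n_max + 1):
        correction = sum(f[k] * powers[n - k][j][j] for k in range(1, n))
        f.append(powers[n][i][j] - correction)
    return f[1:]

P = [[0.080, 0.184, 0.368, 0.368],
     [0.632, 0.368, 0.000, 0.000],
     [0.264, 0.368, 0.368, 0.000],
     [0.080, 0.184, 0.368, 0.368]]

f30 = first_passage_probs(P, 3, 0, 100)
print(round(f30[0], 3), round(f30[1], 3))
```

The first two values should match the hand calculation (0.080 and about 0.243), and for this chain the probabilities over a long horizon should sum to essentially 1.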


For fixed i and j, the $f_{ij}^{(n)}$ are nonnegative numbers such that

\[
\sum_{n=1}^{\infty} f_{ij}^{(n)} \le 1.
\]

Unfortunately, this sum may be strictly less than 1, which implies that a process
initially in state i may never reach state j. When the sum does equal 1, $f_{ij}^{(n)}$ (for n =
1, 2, ...) can be considered as a probability distribution for the random variable,
the first passage time.

Whereas calculating $f_{ij}^{(n)}$ for all n may be difficult, it is relatively simple to
obtain the expected first passage time from state i to state j. Denote this expectation
by $\mu_{ij}$, which is defined by the expressions

\[
\mu_{ij} =
\begin{cases}
\infty, & \text{if } \sum_{n=1}^{\infty} f_{ij}^{(n)} < 1, \\
\sum_{n=1}^{\infty} n f_{ij}^{(n)}, & \text{if } \sum_{n=1}^{\infty} f_{ij}^{(n)} = 1.
\end{cases}
\]

Whenever $\sum_{n=1}^{\infty} f_{ij}^{(n)} = 1$, then $\mu_{ij}$ satisfies uniquely the equation

\[
\mu_{ij} = 1 + \sum_{k \ne j} p_{ik} \mu_{kj}.
\]

When i = j, $\mu_{ii}$ is called the expected recurrence time.
For the inventory example, these equations can be used to compute the expected
time until the cameras are out of stock, assuming the process is started when three
cameras are available; i.e., the expected first passage time, $\mu_{30}$, can be obtained.
Since all the states are recurrent, the system of equations leads to the expressions

\[
\mu_{30} = 1 + p_{31}\mu_{10} + p_{32}\mu_{20} + p_{33}\mu_{30},
\]
\[
\mu_{20} = 1 + p_{21}\mu_{10} + p_{22}\mu_{20} + p_{23}\mu_{30},
\]
\[
\mu_{10} = 1 + p_{11}\mu_{10} + p_{12}\mu_{20} + p_{13}\mu_{30},
\]

or

\[
\mu_{30} = 1 + 0.184\mu_{10} + 0.368\mu_{20} + 0.368\mu_{30},
\]
\[
\mu_{20} = 1 + 0.368\mu_{10} + 0.368\mu_{20},
\]
\[
\mu_{10} = 1 + 0.368\mu_{10}.
\]

The simultaneous solution to this system of equations is

\[
\mu_{10} = 1.58 \text{ weeks}, \quad \mu_{20} = 2.51 \text{ weeks}, \quad \mu_{30} = 3.50 \text{ weeks},
\]

so that the expected time until the cameras are out of stock is 3.50 weeks. In making
these calculations, we also obtain $\mu_{20}$ and $\mu_{10}$.
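The same linear system can be set up and solved for any target state. The sketch below is one possible arrangement, not a prescribed method: it builds the equations $\mu_{ij} = 1 + \sum_{k \ne j} p_{ik}\mu_{kj}$ for all i at once and solves them with a small Gaussian elimination routine. The names `solve` and `expected_first_passage` are hypothetical, and the code assumes every expected first passage time into the target state is finite.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[r])) + [float(b[r])] for r in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(n):
            if r != c:
                factor = M[r][c] / M[c][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [M[r][n] / M[r][r] for r in range(n)]

def expected_first_passage(P, j):
    """mu[i] = expected first passage time from state i to state j.

    Row i encodes mu_ij - sum_{k != j} p_ik * mu_kj = 1; the entry for
    i == j is the expected recurrence time of state j."""
    n = len(P)
    A = [[(1.0 if i == k else 0.0) - (P[i][k] if k != j else 0.0)
          for k in range(n)] for i in range(n)]
    return solve(A, [1.0] * n)

P = [[0.080, 0.184, 0.368, 0.368],
     [0.632, 0.368, 0.000, 0.000],
     [0.264, 0.368, 0.368, 0.000],
     [0.080, 0.184, 0.368, 0.368]]

mu = expected_first_passage(P, 0)
print([round(m, 2) for m in mu])
```

The entries for states 1, 2, and 3 should agree with the values quoted above to within about 0.01, the size of the rounding already present in the tabulated probabilities.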


15.7 Long-Run Properties of Markov Chains 


Steady-State Probabilities 


In Sec. 15.4 the four-step transition matrix for the inventory example was obtained. 
It will now be instructive to examine the eight-step transition probabilities given by 
the matrix 

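\[
\mathbf{P}^{(8)} = \mathbf{P}^8 = \mathbf{P}^{(4)} \cdot \mathbf{P}^{(4)} =
\begin{bmatrix}
0.286 & 0.285 & 0.264 & 0.166 \\
0.286 & 0.285 & 0.264 & 0.166 \\
0.286 & 0.285 & 0.264 & 0.166 \\
0.286 & 0.285 & 0.264 & 0.166
\end{bmatrix}.
\]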


Notice the rather remarkable fact that each of the four rows has identical entries. This 
implies that the probability of being in state j after 8 weeks appears to be independent 
of the initial level of inventory. In other words, it appears that there is a limiting 
probability that the system will be in state j after a large number of transitions, and 
this probability is independent of the initial state. An important result related to the 
long-run behavior of finite-state Markov processes follows. 
For an irreducible ergodic Markov chain, it can be shown that $\lim_{n \to \infty} p_{ij}^{(n)}$ exists
and is independent of i. Furthermore,

\[
\lim_{n \to \infty} p_{ij}^{(n)} = \pi_j,
\]

where the $\pi_j$'s uniquely satisfy the following steady-state equations:

\[
\pi_j > 0,
\]
\[
\pi_j = \sum_{i=0}^{M} \pi_i p_{ij}, \quad \text{for } j = 0, 1, \ldots, M,
\]
\[
\sum_{j=0}^{M} \pi_j = 1.
\]

The $\pi_j$'s are called the steady-state probabilities of the Markov chain and are equal
to the reciprocal of the expected recurrence time; that is,

\[
\pi_j = \frac{1}{\mu_{jj}}, \quad \text{for } j = 0, 1, \ldots, M.
\]

The term steady-state probability means that the probability of finding the process in
a certain state, say j, after a large number of transitions tends to the value $\pi_j$,
independent of the initial probability distribution defined over the states. It is important
to note that steady-state probability does not imply that the process settles down into
one state. On the contrary, the process continues to make transitions from state to
state, and at any step n the transition probability from state i to state j is still $p_{ij}$.

The $\pi_j$'s can also be interpreted as stationary probabilities (not to be confused
with stationary transition probabilities). If the initial absolute probability of being in
state j is given by $\pi_j$ (that is, $P\{X_0 = j\} = \pi_j$) for all j, then the absolute probability
of finding the process in state j at time n = 1, 2, ..., is also given by $\pi_j$ (that is,
$P\{X_n = j\} = \pi_j$).

It should be noted that the steady-state equations consist of (M + 2) equations
in (M + 1) unknowns. Because the system has a unique solution, at least one equation must
be redundant and can, therefore, be deleted. It cannot be the equation

\[
\sum_{j=0}^{M} \pi_j = 1,
\]

because $\pi_j = 0$ for all j will satisfy the other (M + 1) equations. Furthermore, the
other (M + 1) steady-state equations have a unique solution up to a
multiplicative constant, and it is the final equation that forces the solution to be a
probability distribution.

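Returning to the inventory example, the steady-state equations can be expressed
as

\[
\pi_0 = \pi_0 p_{00} + \pi_1 p_{10} + \pi_2 p_{20} + \pi_3 p_{30},
\]
\[
\pi_1 = \pi_0 p_{01} + \pi_1 p_{11} + \pi_2 p_{21} + \pi_3 p_{31},
\]
\[
\pi_2 = \pi_0 p_{02} + \pi_1 p_{12} + \pi_2 p_{22} + \pi_3 p_{32},
\]
\[
\pi_3 = \pi_0 p_{03} + \pi_1 p_{13} + \pi_2 p_{23} + \pi_3 p_{33},
\]
\[
1 = \pi_0 + \pi_1 + \pi_2 + \pi_3.
\]

Substituting values for $p_{ij}$ into these equations leads to the equations

\[
\pi_0 = (0.080)\pi_0 + (0.632)\pi_1 + (0.264)\pi_2 + (0.080)\pi_3,
\]
\[
\pi_1 = (0.184)\pi_0 + (0.368)\pi_1 + (0.368)\pi_2 + (0.184)\pi_3,
\]
\[
\pi_2 = (0.368)\pi_0 + (0.368)\pi_2 + (0.368)\pi_3,
\]
\[
\pi_3 = (0.368)\pi_0 + (0.368)\pi_3,
\]
\[
1 = \pi_0 + \pi_1 + \pi_2 + \pi_3.
\]

Solving the last four equations provides the simultaneous solutions

\[
\pi_0 = 0.285, \quad \pi_1 = 0.285, \quad \pi_2 = 0.264, \quad \pi_3 = 0.166,
\]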


which are essentially the results that appear in the matrix $\mathbf{P}^{(8)}$. Thus after many weeks
the probability of finding zero, one, two, and three cameras in stock tends to 0.285,
0.285, 0.264, and 0.166, respectively. The corresponding expected recurrence times
are

\[
\mu_{00} = \frac{1}{\pi_0} = 3.51 \text{ weeks},
\]
\[
\mu_{11} = \frac{1}{\pi_1} = 3.51 \text{ weeks},
\]
\[
\mu_{22} = \frac{1}{\pi_2} = 3.79 \text{ weeks},
\]
\[
\mu_{33} = \frac{1}{\pi_3} = 6.02 \text{ weeks}.
\]
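The steady-state equations can be solved numerically by dropping one of the redundant balance equations and keeping the normalization. The sketch below follows that recipe for the tabulated inventory matrix; the helper names `solve` and `steady_state` are hypothetical, and the choice to drop the last balance equation is arbitrary.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[r])) + [float(b[r])] for r in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(n):
            if r != c:
                factor = M[r][c] / M[c][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [M[r][n] / M[r][r] for r in range(n)]

def steady_state(P):
    """Steady-state probabilities of an irreducible finite-state chain."""
    n = len(P)
    # Balance equations pi_j - sum_i pi_i * p_ij = 0 for j = 0, ..., n-2.
    A = [[(1.0 if i == j else 0.0) - P[i][j] for i in range(n)]
         for j in range(n - 1)]
    A.append([1.0] * n)                # normalization: the pi_j sum to 1
    b = [0.0] * (n - 1) + [1.0]
    return solve(A, b)

P = [[0.080, 0.184, 0.368, 0.368],
     [0.632, 0.368, 0.000, 0.000],
     [0.264, 0.368, 0.368, 0.000],
     [0.080, 0.184, 0.368, 0.368]]

pi = steady_state(P)
print([round(x, 3) for x in pi])
print([round(1.0 / x, 2) for x in pi])   # expected recurrence times
```

The computed probabilities should agree with the values above to within about 0.001, and their reciprocals give the expected recurrence times; small differences in the last digit reflect rounding in the hand calculation.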


There are other important results concerning steady-state probabilities. In
particular, if i and j are recurrent states belonging to different classes, then

\[
p_{ij}^{(n)} = 0, \quad \text{for all } n.
\]

This result follows from the definition of a class.
Similarly, if j is a transient state, then

\[
\lim_{n \to \infty} p_{ij}^{(n)} = 0, \quad \text{for all } i.
\]

This result implies that the probability of finding the process in a transient state after 
a large number of transitions tends to zero. 


Expected Average Cost Per Unit Time 


The previous subsection dealt with Markov chains whose states were ergodic (positive
recurrent and aperiodic). If the requirement that the states be aperiodic is relaxed,
then the limit

\[
\lim_{n \to \infty} p_{ij}^{(n)}
\]

may not exist. To illustrate this point, consider the two-state transition matrix

\[
\mathbf{P} =
\begin{bmatrix}
0 & 1 \\
1 & 0
\end{bmatrix}.
\]

If the process starts in state 0 at time 0, it will be in state 0 at times 2, 4, 6, ..., and
in state 1 at times 1, 3, 5, .... Thus $p_{00}^{(n)} = 1$ if n is even and $p_{00}^{(n)} = 0$ if n is odd,
so that

\[
\lim_{n \to \infty} p_{00}^{(n)}
\]

does not exist. However, the following limit always exists: For an irreducible Markov
chain with positive recurrent states, e.g., a finite-state chain,

\[
\lim_{n \to \infty} \left( \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} \right) = \pi_j,
\]

where the $\pi_j$'s satisfy the steady-state equations presented in the preceding subsection.
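The contrast between the oscillating n-step probabilities and their convergent average can be seen numerically for the two-state matrix. The following is a small sketch; the function name `cesaro_average` is hypothetical.

```python
def cesaro_average(P, i, j, n):
    """(1/n) * sum_{k=1}^{n} p_ij^(k), computed by propagating row i of P^(k)."""
    size = len(P)
    row = [1.0 if s == i else 0.0 for s in range(size)]
    total = 0.0
    for _ in range(n):
        row = [sum(row[k] * P[k][s] for k in range(size)) for s in range(size)]
        total += row[j]                # row[j] is p_ij^(k) at this step
    return total / n

P = [[0.0, 1.0],
     [1.0, 0.0]]

print(cesaro_average(P, 0, 0, 1000))   # prints 0.5
```

Here the individual terms alternate between 0 and 1, yet the running average settles at 1/2, which is the steady-state probability of state 0 for this chain.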

This result is extremely important in computing the long-run average cost per
unit time associated with a Markov chain. Suppose that a cost (or other penalty
function) $C(X_t)$ is incurred when the process is in state $X_t$ at time t, for t =
0, 1, 2, .... Note that $C(X_t)$ is a random variable that takes on any one of the values
C(0), C(1), ..., C(M), and the function C(·) is independent of t. The expected
average cost incurred over the first n periods is given by the expression

\[
E\left[ \frac{1}{n} \sum_{t=1}^{n} C(X_t) \right].
\]

Using the result that

\[
\lim_{n \to \infty} \left( \frac{1}{n} \sum_{k=1}^{n} p_{ij}^{(k)} \right) = \pi_j,
\]

it is simple to show that the (long-run) expected average cost per unit time is given
by

\[
\lim_{n \to \infty} E\left[ \frac{1}{n} \sum_{t=1}^{n} C(X_t) \right] = \sum_{j=0}^{M} \pi_j C(j).
\]

As an example, suppose the camera store finds that a storage charge is being allocated
for each camera remaining on the shelf at the end of the week. The cost is charged
as follows: If $X_t = 0$, then C(0) = 0. If $X_t = 1$, then C(1) = 2. If $X_t = 2$, then
C(2) = 8. Finally, if $X_t = 3$, then C(3) = 18. The long-run expected average holding
cost per week can then be obtained from the preceding equation; that is,

\[
\lim_{n \to \infty} E\left[ \frac{1}{n} \sum_{t=1}^{n} C(X_t) \right]
= 0(0.285) + 2(0.285) + 8(0.264) + 18(0.166) = 5.67.
\]
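Both sides of this relationship can be evaluated numerically: the right-hand side as a weighted sum of the storage charges, and the left-hand side by averaging the expected cost over a long but finite horizon. The sketch below assumes the tabulated inventory matrix, the steady-state values quoted in the text, and a start with three cameras; the function names are hypothetical.

```python
def long_run_average_cost(pi, cost):
    """Sum over states of pi_j * C(j)."""
    return sum(p * c for p, c in zip(pi, cost))

def expected_average_cost(P, q0, cost, n):
    """E[(1/n) * sum_{t=1}^{n} C(X_t)] for initial distribution q0."""
    q, total = list(q0), 0.0
    for _ in range(n):
        q = [sum(q[i] * P[i][j] for i in range(len(q))) for j in range(len(q))]
        total += sum(qj * c for qj, c in zip(q, cost))   # E[C(X_t)]
    return total / n

P = [[0.080, 0.184, 0.368, 0.368],
     [0.632, 0.368, 0.000, 0.000],
     [0.264, 0.368, 0.368, 0.000],
     [0.080, 0.184, 0.368, 0.368]]
cost = [0, 2, 8, 18]                   # storage charges C(0), ..., C(3)
pi = [0.285, 0.285, 0.264, 0.166]      # steady-state values from the text

print(round(long_run_average_cost(pi, cost), 2))
print(round(expected_average_cost(P, [0, 0, 0, 1], cost, 5000), 2))
```

Both printed values should be close to 5.67; the finite-horizon average depends only weakly on the starting state once n is large.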


It should be noted that an alternative measure to the (long-run) expected average
cost per unit time is the (long-run) actual average cost per unit time. It can be shown
that this latter measure is given by

\[
\lim_{n \to \infty} \left[ \frac{1}{n} \sum_{t=1}^{n} C(X_t) \right] = \sum_{j=0}^{M} \pi_j C(j)
\]

for almost all paths of the process. Thus either measure leads to the same result.
These results can also be used to interpret the meaning of the $\pi_j$'s. To interpret them,
let

\[
C(X_t) =
\begin{cases}
1, & \text{if } X_t = j, \\
0, & \text{if } X_t \ne j.
\end{cases}
\]

The (long-run) expected fraction of times the system is in state j is then given by

\[
\lim_{n \to \infty} E\left[ \frac{1}{n} \sum_{t=1}^{n} C(X_t) \right]
= \lim_{n \to \infty} E[\text{fraction of times system is in state } j] = \pi_j.
\]

Similarly, $\pi_j$ can also be interpreted as the (long-run) actual fraction of times that the
system is in state j.


Expected Average Cost Per Unit Time for Complex Cost Functions 


In the preceding subsection, the cost function was based solely on the state that the 
process is in at time t. In many important problems encountered in practice, the cost
may depend upon another random variable as well as upon the state that the process 
is in. For example, in the inventory example developed in this chapter, suppose that 
the costs to be considered are the ordering cost and the penalty cost for unsatisfied 
demand (storage costs will be ignored). It is reasonable to assume that the number of 
cameras ordered depends only upon the state of the process (the number of cameras 
in stock) when the order is placed. The cost for unsatisfied demand may be assumed 
to depend upon the demand during the week as well as upon the state of the process 
at the beginning of the week. The charges for period t will be made at the end of the 
week and will include the cost of the order delivered on the Monday of that week and 
the cost of unsatisfied demand during the week. Thus the cost incurred for period t 
can be described as a function of X,_, and D, that is, C(X,_,, D,). Note that the 
demands D,, D,,,, . . . , during successive weeks are assumed to be independent and 
identically distributed random variables. Furthermore, recall that the (s, S) policy 
(1, 3) is being used. X,_,, the stock level at the end of period t — 1 (before ordering), 
is defined iteratively by the expression given in Sec. 15.2. Thus it follows that 
(Xo, Xi; X2,- - -3 X) and D, are independent random variables because 
Xo, Xis X2, ... , X,—ı are functions only of Xp, D,,...,D,~,, which are inde- 
pendent of D,. Under these conditions, it can be shown that the (long-run) expected 
average cost per unit time is given by 


n M 
lim fe E X CE v|} = >) KT, 
A p=] j=0 


n—>%0 


where k(j) = E(C, D»I, 


and this latter (conditional) expectation is taken with respect to the probability distri- 
bution of the random variable D, (given the state). Similarly, the (long-run) actual 
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average cost per unit time is given by 


n M 
lim [: X OX,- a} = > kT, 
n=>æ (7 t= j=0 

Suppose that the following costs are associated with the (s, S) inventory policy
given earlier (storage charges are now neglected). If z > 0 cameras are ordered, the
cost incurred is 10 + 25z dollars. If no cameras are ordered, no ordering cost is
incurred. For each unit of unsatisfied demand (lost sales), there is a penalty of $50
per unit. If the (s = 1, S = 3) ordering policy is followed, then the cost in week t
is given by \(C(X_{t-1}, D_t)\), where

\[
C(X_{t-1}, D_t) = \begin{cases}
10 + (25)(3) + 50 \max\{(D_t - 3), 0\}, & \text{if } X_{t-1} < 1 \\
50 \max\{(D_t - X_{t-1}), 0\}, & \text{if } X_{t-1} \ge 1
\end{cases}
\]

for t = 1, 2, . . . . Hence

\[
C(0, D_t) = 85 + 50 \max\{(D_t - 3), 0\},
\]

so that

\[
k(0) = E[C(0, D_t)] = 85 + 50 E[\max\{(D_t - 3), 0\}]
     = 85 + 50[1 P_D(4) + 2 P_D(5) + 3 P_D(6) + \cdots],
\]

where \(P_D(i)\) is the probability that the demand equals i and has been assumed to have
a Poisson distribution with parameter λ = 1. Hence k(0) = 86.2. Similar calculations
lead to the results

\[
k(1) = E[C(1, D_t)] = 50 E[\max\{(D_t - 1), 0\}]
     = 50[1 P_D(2) + 2 P_D(3) + 3 P_D(4) + \cdots] = 18.4,
\]

\[
k(2) = E[C(2, D_t)] = 50 E[\max\{(D_t - 2), 0\}]
     = 50[1 P_D(3) + 2 P_D(4) + 3 P_D(5) + \cdots] = 5.2,
\]

and

\[
k(3) = E[C(3, D_t)] = 50 E[\max\{(D_t - 3), 0\}]
     = 50[1 P_D(4) + 2 P_D(5) + \cdots] = 1.2.
\]

Thus the (long-run) expected average inventory cost per week is given by

\[
\sum_{j=0}^{3} k(j)\,\pi_j = (86.2)(0.285) + (18.4)(0.285) + (5.2)(0.264) + (1.2)(0.166) = 31.4.
\]

This cost is the cost associated with the (s, S) policy (s, S) = (1, 3). The cost of
other (s, S) policies can be evaluated in a similar way to identify the policy that
minimizes the expected average inventory cost per week.
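The values of k(j) and the resulting weekly cost can be reproduced numerically. The sketch below is illustrative only: the function names are hypothetical, the infinite Poisson sum is truncated at an assumed cutoff `d_max`, ordering "up to S" whenever the stock is below s is the assumed reading of the (s, S) policy, and the steady-state probabilities are the rounded values quoted in the text.

```python
import math

def poisson_pmf(d, lam):
    """P{D = d} for Poisson demand with parameter lam."""
    return math.exp(-lam) * lam**d / math.factorial(d)

def expected_shortage(stock, lam, d_max=60):
    """E[max(D - stock, 0)], truncating the infinite sum at d_max (assumption)."""
    return sum((d - stock) * poisson_pmf(d, lam)
               for d in range(stock + 1, d_max + 1))

def k(j, s=1, S=3, lam=1.0, fixed=10.0, unit=25.0, penalty=50.0):
    """Expected cost for a week that starts from end-of-week stock j."""
    if j < s:
        # Order up to S: pay the fixed charge plus the per-camera charge.
        order_cost = fixed + unit * (S - j)
        stock = S
    else:
        order_cost = 0.0
        stock = j
    return order_cost + penalty * expected_shortage(stock, lam)

pi = [0.285, 0.285, 0.264, 0.166]      # rounded values from the text
k_values = [k(j) for j in range(4)]
average_cost = sum(kj * pj for kj, pj in zip(k_values, pi))

print([round(v, 1) for v in k_values])
print(round(average_cost, 1))
```

Changing `s` and `S` in the call to `k` gives the corresponding k(j) for another policy, but the steady-state probabilities would then have to be recomputed for that policy's transition matrix.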

The results of this section were presented only in terms of the inventory example. 
However, the (nonnumerical) results still hold for other problems as long as the 
following conditions are satisfied: 


1. \(\{X_t\}\) is an irreducible Markov chain whose states are positive recurrent.

2. Associated with this Markov chain is a sequence of random variables \(\{D_t\}\),
each of which is independent and identically distributed.

3. For a fixed m = 0, ±1, ±2, . . . , a cost \(C(X_t, D_{t+m})\) is incurred at time
t, for t = 0, 1, 2, . . . .

4. The sequence \((X_0, X_1, X_2, \ldots, X_t)\) must be independent of \(D_{t+m}\).


In particular, if these conditions are satisfied, then

\[
\lim_{n\to\infty} E\left[\frac{1}{n}\sum_{t=1}^{n} C(X_t, D_{t+m})\right] = \sum_{j=0}^{M} k(j)\,\pi_j,
\]

where

\[
k(j) = E[C(j, D_{t+m})],
\]

and this latter conditional expectation is taken with respect to the probability
distribution of the random variable \(D_t\) (given the state). Furthermore,

\[
\lim_{n\to\infty} \left\{\frac{1}{n}\sum_{t=1}^{n} C(X_t, D_{t+m})\right\} = \sum_{j=0}^{M} k(j)\,\pi_j
\]

for almost all paths of the process.


15.8 Absorption States 


It was pointed out that a state k is called an absorbing state if \(p_{kk} = 1\), so that once
the chain visits k it remains there forever. If k is an absorbing state, the first passage
probability from i to k is called the probability of absorption into k, having started at
i. When there are two or more absorbing states in a chain, and when it is evident that
the process will be absorbed into one of these states, it is desirable to find these
probabilities of absorption. These probabilities can be obtained by solving a system
of linear equations. Suppose that the Markov chain is such that ultimately one of the
absorbing states will be reached. If the state k is an absorbing state, then the set of
absorption probabilities \(f_{ik}\) satisfies the system of equations

\[
f_{ik} = \sum_{j=0}^{M} p_{ij} f_{jk}, \quad \text{for } i = 0, 1, \ldots, M,
\]

subject to the conditions

\[
f_{kk} = 1, \qquad f_{ik} = 0 \ \text{ if state } i \text{ is recurrent and } i \ne k.
\]


Absorption probabilities are important in ‘‘random walks.” A random walk is 
a Markov chain with the property that if the system is in a state i, then in a single 
transition the system either remains at i or moves to one of the states immediately 
adjacent to i. For example, a random walk often is used as a model for situations 
involving gambling. To illustrate, consider a gambling example similar to that pre- 
sented in Sec. 15.3. However, suppose now that two players, each having $2, agree 
to keep playing a game and betting $1 at a time until one is broke. The amount of 
money that player A has forms a Markov chain with the states representing player A’s 
fortune, i.e., 0, 1, 2, 3, 4, and with transition matrix 


\[
P = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
1-p & 0 & p & 0 & 0 \\
0 & 1-p & 0 & p & 0 \\
0 & 0 & 1-p & 0 & p \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\]


If p represents the probability of A winning a single encounter, then the probability
of absorption into state 0 (A losing all his money) can be obtained from the
preceding system of equations. It can be shown that these equations then result in the
alternate expressions (for general M rather than M = 4 as in this example),

\[
1 - f_{i0} = \frac{\sum_{m=0}^{i-1} \rho^m}{\sum_{m=0}^{M-1} \rho^m}
= \begin{cases}
\dfrac{1 - \rho^i}{1 - \rho^M}, & \text{for } p \ne \tfrac{1}{2} \\[2ex]
\dfrac{i}{M}, & \text{for } p = \tfrac{1}{2},
\end{cases}
\quad \text{for } i = 1, 2, \ldots, M,
\]

where ρ = (1 − p)/p.
For M = 4, i = 2, and p = 2/3, the probability of A going broke is given by

\[
f_{20} = 1 - \frac{1 - \rho^2}{1 - \rho^4} = \frac{1}{5},
\]

and the probability of A winning $4 (B going broke) is given by

\[
f_{24} = 1 - f_{20} = \frac{4}{5}.
\]
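The closed-form expression for 1 − f_{i0} is easy to evaluate exactly. The following is a minimal sketch using exact fractions; the function name `prob_reach_M` is hypothetical, and the numerical case shown takes M = 4, i = 2, and p = 2/3.

```python
from fractions import Fraction

def prob_reach_M(i, M, p):
    """1 - f_{i0}: probability that A's fortune reaches M before 0, starting at i."""
    p = Fraction(p)
    if p == Fraction(1, 2):
        return Fraction(i, M)
    rho = (1 - p) / p
    return (1 - rho**i) / (1 - rho**M)

win = prob_reach_M(2, 4, Fraction(2, 3))   # A ends with $4 (B goes broke)
ruin = 1 - win                             # A goes broke
print(win, ruin)  # 4/5 1/5
```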





There are many other situations where absorbing states play an important role. 
Consider a department store that classifies the balance of a customer’s bill as fully 
paid (state 0), 1-30 days in arrears (state 1), 31-60 days in arrears (state 2), or bad 
debt (state 3). The accounts are checked monthly, and the state of each customer is 
determined. In general, credit is not extended and customers are expected to pay their 
bills within 30 days. Occasionally, customers pay only portions of their bill. If this 
occurs when the balance is within 30 days in arrears (state 1), the store views the 
customer as remaining in state 1. If this occurs when the balance is between 31 and 
60 days in arrears, the store views the customer as moving to state 1 (1-30 days in 
arrears). Customers that are more than 60 days in arrears are put into the bad debt 
category (state 3), and then bills are sent to a collection agency. After examining data 
over the past several years, the store has developed the following transition matrix:¹






























State                       0: Fully Paid   1: 1-30 Days   2: 31-60 Days   3: Bad Debt
                                            in Arrears     in Arrears
0: fully paid                    1              0              0               0
1: 1-30 days in arrears          0.7            0.2            0.1             0
2: 31-60 days in arrears         0.5            0.1            0.2             0.2
3: bad debt                      0              0              0               1




















Although each customer ends up in state 0 or 3, the store is interested in determining
the probability that a customer will end up as a bad debt given that the account belongs
to the 1-30 days in arrears state, and similarly, given that the account belongs to the
31-60 days in arrears state.


¹ Customers that are fully paid (in state 0) may be viewed as "new" customers if they return for repeat
purchases.


In order to obtain this information, the set of equations presented at the beginning
of this section must be solved. In particular, \(f_{13}\) and \(f_{23}\) are to be obtained.
Substituting, the following two equations are obtained:


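\[
f_{13} = p_{10} f_{03} + p_{11} f_{13} + p_{12} f_{23} + p_{13} f_{33},
\]
\[
f_{23} = p_{20} f_{03} + p_{21} f_{13} + p_{22} f_{23} + p_{23} f_{33}.
\]

Noting that \(f_{03} = 0\) and \(f_{33} = 1\), there are now two equations in two unknowns,
namely,

\[
(1 - p_{11}) f_{13} = p_{13} + p_{12} f_{23},
\]
\[
(1 - p_{22}) f_{23} = p_{23} + p_{21} f_{13}.
\]

Substituting the values from the transition matrix leads to

\[
0.8 f_{13} = 0.1 f_{23},
\]
\[
0.8 f_{23} = 0.2 + 0.1 f_{13},
\]

and the solution is

\[
f_{13} = 0.032, \qquad f_{23} = 0.254.
\]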


Thus, approximately 3 percent of the customers whose accounts are 1-30 days in
arrears end up as bad debts, whereas about 25 percent of the customers whose accounts are
31-60 days in arrears end up as bad debts.
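The same absorption probabilities can be obtained by solving the linear system mechanically. The sketch below is one possible way to do this in plain Python, not a prescribed method: `solve_linear` and `absorption_probabilities` are hypothetical helper names, and the matrix rows for the two absorbing states (0 and 3) are taken to be unit rows.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def absorption_probabilities(P, transient, target):
    """Solve f_i - sum_{j transient} p_ij f_j = p_{i,target} for transient i."""
    A = [[(1.0 if i == j else 0.0) - P[i][j] for j in transient]
         for i in transient]
    b = [P[i][target] for i in transient]
    return dict(zip(transient, solve_linear(A, b)))

# Department store chain: states 0 (fully paid) and 3 (bad debt) are absorbing.
P = [[1.0, 0.0, 0.0, 0.0],
     [0.7, 0.2, 0.1, 0.0],
     [0.5, 0.1, 0.2, 0.2],
     [0.0, 0.0, 0.0, 1.0]]

f = absorption_probabilities(P, transient=[1, 2], target=3)
print(round(f[1], 3), round(f[2], 3))  # 0.032 0.254
```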


15.9 Continuous Time Markov Chains 


All the previous sections assumed that the time parameter t was discrete (that is,
t = 0, 1, 2, . . .). Such an assumption is suitable for many problems, but there are
certain cases (such as for some queueing models) where a continuous time parameter
is required.

Following the definitions for discrete time Markov chains, let {X(t)}, where
t ≥ 0, be a stochastic process taking on values 0, 1, . . . , M. This process is called
a continuous time Markov chain if the transition probabilities can be expressed as

\[
P\{X(t + s) = j \mid X(s) = i \text{ and } X(r) = x(r)\} = P\{X(t + s) = j \mid X(s) = i\},
\]

for all i, j = 0, 1, . . . , M and 0 ≤ r < s. Furthermore, if these transition
probabilities are independent of s, they can be expressed as

\[
p_{ij}(t) = P\{X(t + s) = j \mid X(s) = i\},
\]

and are called stationary transition probabilities. This function is assumed to be
continuous at t = 0, with

\[
\lim_{t \to 0} p_{ij}(t) = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{if } i \ne j. \end{cases}
\]

Note that a continuous time Markov chain has the property (similar to that of discrete
time Markov chains) that the probability of any future "event" occurring, given any
past "event" and the present state, is independent of the past event and depends only
upon the present state of the process. Furthermore, if the continuous time Markov
chain is stationary (as will be assumed herein), the transition probability \(p_{ij}(t)\) is
independent of the time at which the present state occurred.

It is worthwhile exploring the implications of the Markovian property. The
Markovian property (with stationary transition probabilities) implies that the
probability of the system being in state j at time t + s, given that the system is in state i
at time s, that is,

\[
P\{X(t + s) = j \mid X(s) = i\},
\]

depends only on i and j and the incremental value of t and is independent of s; that
is, it can be expressed as \(p_{ij}(t)\). Similarly, when i = j, the probability of the system
remaining in state i at time t + s, given that it is in state i at time s, depends only
on i and the incremental value of t and is independent of s; that is, it can be expressed
as \(p_{ii}(t)\). Now consider the random variable \(T_i\), which represents the time required for
the system to transit out of state i. The system remains in state i at time t + s, given
that the system is in state i at time s, that is, X(t + s) = i | X(s) = i, if, and only if,
the time required for the system to transit out of state i exceeds t + s when the time
the system has spent in state i exceeds s; that is, \(T_i > t + s \mid T_i > s\). Therefore,

\[
P\{X(t + s) = i \mid X(s) = i\} = P\{T_i > t + s \mid T_i > s\}.
\]

Similarly, the system remains in state i at time t given that the system starts in
state i at time 0, that is, X(t) = i | X(0) = i, if, and only if, the time required for the
system to transit out of state i exceeds t when the system starts in state i at time 0,
that is, \(T_i > t \mid T_i > 0\), or, equivalently, \(T_i > t\). Therefore,

\[
P\{X(t) = i \mid X(0) = i\} = P\{T_i > t\}.
\]

The Markovian property (with stationary transition probabilities) implies that

\[
P\{X(t + s) = i \mid X(s) = i\} = P\{X(t) = i \mid X(0) = i\},
\]

so that

\[
P\{T_i > t + s \mid T_i > s\} = P\{T_i > t\}.
\]


This is a rather unusual condition for a probability distribution to possess. The
probability distribution of the time required for the system to transit out of a given state
always is the same, regardless of how much time the system has already spent in that
state. In effect, the random variable is memoryless; the process forgets its history.
There is only one (continuous) probability distribution that possesses this property,
the exponential distribution with, say, mean 1/q (see Sec. 16.4 for a complete
discussion of this distribution), i.e.,

\[
P\{T_i \le t\} = 1 - e^{-qt}, \quad \text{for } t \ge 0.
\]
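The memoryless property of the exponential distribution can be verified numerically from its survival function P{T > t} = e^{−qt}. The values of `q`, `s`, and `t` below are arbitrary illustrative choices, and `survival` is a hypothetical helper name.

```python
import math

def survival(t, q):
    """P{T > t} for an exponential random variable with rate q (mean 1/q)."""
    return math.exp(-q * t)

q, s, t = 0.5, 2.0, 3.0   # illustrative values only

# P{T > t + s | T > s} = P{T > t + s} / P{T > s}
conditional = survival(t + s, q) / survival(s, q)
unconditional = survival(t, q)   # P{T > t}

print(math.isclose(conditional, unconditional))  # True
```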


This result leads to an equivalent way of defining a continuous time Markov
chain. A stochastic process {X(t)}, where t ≥ 0, taking on values 0, 1, . . . , M, is a
continuous time Markov chain if:


1. Each time the process enters a state i, the amount of time it spends in that
state before transiting to a different state has an exponential distribution with
mean time \(1/q_i\);


2. When leaving state i, the process moves to a state j, with probabilities \(p_{ij}\),
where the \(p_{ij}\) satisfy the conditions

\[
p_{ii} = 0, \quad \text{for all } i,
\]

and

\[
\sum_{j=0}^{M} p_{ij} = 1, \quad \text{for all } i;
\]

and


3. The next state visited after transiting from state i is independent of the time
spent in state i.


Just as the one-step transition probabilities played a major role in describing the
Markov process for a discrete time parameter chain, the analogous role for a
continuous time parameter chain is played by the transition intensities. The transition
intensity is defined by

\[
q_j = -\frac{d}{dt} p_{jj}(0) = \lim_{t \to 0} \frac{1 - p_{jj}(t)}{t}, \quad \text{for } j = 0, 1, \ldots, M,
\]

and

\[
q_{ij} = \frac{d}{dt} p_{ij}(0) = \lim_{t \to 0} \frac{p_{ij}(t)}{t} = q_i p_{ij}, \quad \text{for all } i \ne j.
\]

(For finite-state continuous Markov chains these limits exist and are finite.) Here \(q_j\)
is called the intensity of passage, given that the Markov chain is in state j, and \(q_{ij}\) is
called the intensity of transition to state j from the state i. The quantity \(p_{ij}\) has already
been defined as the probability, when leaving state i, that the process moves to state
j. The intensity of passage, \(q_i\), is just the reciprocal of the expected value of the
exponentially distributed random variable that represents the time required for the
process to transit out of state i; that is, \(q_i\) is the rate at which the process leaves state
i when it is in state i. Similarly, the transition intensity, \(q_{ij}\), can be interpreted as the
transition rate into state j when the process is in state i. Thus, the process, starting in
state i, spends an amount of time in that state before transiting to a different state that
has an exponential distribution with rate \(q_i\). The process then moves to state j with
probability \(p_{ij} = q_{ij}/q_i\) and spends an amount of time in state j that has an exponential
distribution with rate \(q_j\), and so on. Finally, note that the intensity of passage and the
intensity of transition are related in that

\[
q_i = \sum_{j=0,\, j \ne i}^{M} q_{ij}.
\]


These results suggest still another equivalent way of defining a continuous time
Markov chain. A stochastic process {X(t)}, where t ≥ 0, taking on values
0, 1, . . . , M, is a continuous time Markov chain if the process, when in state i, is
poised to move to one of the M other alternative states, and the times required to
transit from state i to each possible state j (j = 0, 1, . . . , M and j ≠ i), denoted
by the random variables \(T_{ij}\), have independent exponential distributions with means
equal to \(1/q_{ij}\). From state i, the particular state to which the process ultimately transits,
denoted by j*, is that state j whose corresponding \(T_{ij}\) is smallest. Once the process
enters the new state j*, a new set of independent exponential random variables is
utilized in a similar fashion to determine the time the chain resides in the state j*,
and the next state visited from j*. The \(q_{ij}\) are the transition rates alluded to previously.
If \(T_{ij^*}\) is defined as the minimum of the \(T_{ij}\) (j ≠ i), it is easily shown (see Sec. 16.4)
that \(T_{ij^*}\) has an exponential distribution with mean time \(1/q_i\) (where \(q_i\) is the intensity
of passage defined earlier) and, as previously noted,¹

\[
q_i = \sum_{j=0,\, j \ne i}^{M} q_{ij}.
\]
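The relations q_i = Σ_{j≠i} q_ij and p_ij = q_ij / q_i can be sketched in code. This is only an illustration under stated assumptions: `passage_and_jump` is a hypothetical helper, the rate values are made up for the example, and every state is assumed to have at least one positive outgoing rate so that q_i > 0.

```python
def passage_and_jump(rates):
    """Given rates[i][j] = q_ij for i != j (diagonal ignored), return (q, p).

    q[i]    = intensity of passage out of state i (sum of outgoing rates)
    p[i][j] = probability that the next state is j when leaving state i
    """
    n = len(rates)
    q = [sum(rates[i][j] for j in range(n) if j != i) for i in range(n)]
    p = [[(rates[i][j] / q[i] if j != i else 0.0) for j in range(n)]
         for i in range(n)]
    return q, p

# Hypothetical transition intensities for a three-state chain.
rates = [[0.0, 2.0, 1.0],
         [3.0, 0.0, 1.0],
         [0.5, 0.5, 0.0]]

q, p = passage_and_jump(rates)
print(q)      # [3.0, 4.0, 1.0]
print(p[2])   # [0.5, 0.5, 0.0]
```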

Just as the discrete time parameter models satisfy the Chapman-Kolmogorov
equations, the continuous time transition probability function also satisfies these
equations; i.e., for any states i and j, and positive numbers t and v (0 ≤ v ≤ t),

\[
p_{ij}(t) = \sum_{k=0}^{M} p_{ik}(v)\, p_{kj}(t - v).
\]

A pair of states i and j are said to communicate if there are times \(t_1\) and \(t_2\) such that
\(p_{ij}(t_1) > 0\) and \(p_{ji}(t_2) > 0\). All states that communicate are said to form a class. If all
states in a chain form a single class, i.e., an irreducible chain (which will be assumed),
then

\[
p_{ij}(t) > 0, \quad \text{for all } t > 0 \text{ and all states } i \text{ and } j.
\]

Furthermore,

\[
\lim_{t \to \infty} p_{ij}(t) = \pi_j
\]

always exists and is independent of the initial state of the chain, for j = 0, 1, . . . , M.
The \(\pi_j\) satisfy the equations

\[
\pi_j = \sum_{i=0}^{M} \pi_i\, p_{ij}(t), \quad \text{for } j = 0, 1, \ldots, M, \text{ and for every } t \ge 0.
\]

The limiting probabilities also satisfy the equations

\[
\pi_j q_j = \sum_{i=0,\, i \ne j}^{M} \pi_i q_{ij}, \quad \text{for } j = 0, 1, \ldots, M,
\]

and

\[
\sum_{j=0}^{M} \pi_j = 1,
\]

which can be used to obtain the values of these limiting probabilities.

It has already been noted that \(q_j\) is the rate at which the process leaves state j
given that it is in state j, so that \(q_j \pi_j\) is just the rate (in the long run) at which the
process leaves state j. Similarly, \(q_{ij}\) is just the transition rate into state j given that the
process is in state i, so that \(\sum_{i \ne j} \pi_i q_{ij}\) is just the overall rate of transition into state j.
Since the long-term rate at which a process leaves a state j must equal the long-term
rate at which a process arrives at state j, in order to maintain a stable system,

\[
\pi_j q_j = \sum_{i=0,\, i \ne j}^{M} \pi_i q_{ij}, \quad \text{for } j = 0, 1, \ldots, M,
\]

which is the equation given above for obtaining limiting probabilities; these equations
are sometimes called the balance equations.
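The balance equations, together with the normalizing condition, form a linear system that can be solved directly. The sketch below is one possible approach under stated assumptions: the helper names are hypothetical, the chain is assumed irreducible, one (redundant) balance equation is replaced by the condition that the probabilities sum to 1, and the two-state rates are made up for the example.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def steady_state(rates):
    """Limiting probabilities from rates[i][j] = q_ij (diagonal ignored)."""
    n = len(rates)
    q = [sum(rates[j][k] for k in range(n) if k != j) for j in range(n)]
    A, b = [], []
    for j in range(n - 1):
        # Balance equation for state j: pi_j q_j - sum_{i != j} pi_i q_ij = 0.
        A.append([q[j] if i == j else -rates[i][j] for i in range(n)])
        b.append(0.0)
    A.append([1.0] * n)   # normalizing condition: probabilities sum to 1
    b.append(1.0)
    return solve_linear(A, b)

# Hypothetical two-state chain: q_01 = 1, q_10 = 3.
pi = steady_state([[0.0, 1.0],
                   [3.0, 0.0]])
print(pi)  # [0.75, 0.25]
```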


1 This useful result is used again in the final illustrative example in the chapter. 


As an example, consider the following repairer problem, which is discussed in
detail in Chap. 16 ("Queueing Theory"). There are two machines serviced by a single
repairer. A machine that breaks down is serviced immediately unless the repairer is
servicing another machine, in which case a waiting line is formed. A system is said
to be in state n if n machines are not working. If the process is in state n, where
1 ≤ n ≤ 2, this state means that one machine is being serviced and (n − 1) are in
the waiting line waiting to be repaired. If the process is in state 0, then all machines
are working, and the repairer is idle. Let X(t) be the number of machines not working
at time t.

Assume that the time to failure of a working machine has an exponential
distribution with mean 1/λ and that the service time for a failed machine has an
independent exponential distribution with mean 1/μ. The stochastic process {X(t)} is a
continuous parameter Markov process with possible states 0, 1, 2. The steady-state
probabilities of being in these states are of interest.

Before solving the balance equations, it is important to note some characteristics
of the \(q_{ij}\) for this model. A process in, say, state 1 (one machine not working) can
move only to an adjacent state, that is, zero machines not working or two machines
not working. Hence, \(q_{ij} = 0\) whenever |i − j| ≥ 2. The balance equations become

\[
\pi_0 q_0 = \pi_1 q_{10},
\]
\[
\pi_1 q_1 = \pi_0 q_{01} + \pi_2 q_{21},
\]
\[
\pi_2 q_2 = \pi_1 q_{12}.
\]

Solving these equations in terms of \(\pi_0\) leads to

\[
\pi_1 = \pi_0 \frac{q_0}{q_{10}},
\]
\[
\pi_2 = \pi_0 \frac{q_0}{q_{10}} \frac{q_{12}}{q_2}.
\]

Noting that \(\pi_0 + \pi_1 + \pi_2 = 1\) results in

\[
\pi_0 = \frac{1}{1 + (q_0/q_{10}) + (q_0/q_{10})(q_{12}/q_2)}.
\]

Values for \(q_0\), \(q_2\), \(q_{10}\), and \(q_{12}\) can be obtained from the model as functions of μ
and λ.

In order to find these values, the following general result is utilized (see Sec.
16.4). If \(T_1, T_2, \ldots, T_k\) are k independent random variables having exponential
distributions with rates \(u_1, u_2, \ldots, u_k\), respectively, then Z, the minimum of these
k random variables, also has an exponential distribution but with rate
\(u_1 + u_2 + \cdots + u_k\). Returning to the machine example, \(q_0\) is the rate at which the process
leaves state 0 (all machines working) when it is in state 0. Since there are two
machines, both working, and each breaks down according to an exponential
distribution with rate λ, the process leaves state 0 when the first machine breaks down,
i.e., when the minimum of the two failure times occurs. Hence,

\[
q_0 = \lambda + \lambda = 2\lambda.
\]

Similarly, \(q_2\) is the rate at which the process leaves state 2 (both machines are down)
when it is in state 2. When both machines are down, one is being repaired by the
repairer, so that the process leaves state 2 when the machine is repaired. Therefore,
the rate at which the process leaves state 2 is given by

\[
q_2 = \mu.
\]

The rate \(q_{10}\) is the transition rate into state 0 (both machines working) when the process
is in state 1 (one machine down). The rate at which this occurs is the rate at which
the machine is repaired, i.e.,

\[
q_{10} = \mu.
\]

Similarly, \(q_{12}\) is the transition rate into state 2 (both machines down) when the process
is in state 1 (one machine down). The rate at which this occurs is the rate at which a
single machine fails, i.e.,

\[
q_{12} = \lambda.
\]
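The result that the minimum of independent exponential times is again exponential, with rate equal to the sum of the rates, can be illustrated by simulation. This is only a rough sketch: the function name, the sample size, the seed, and the value λ = 1 are all arbitrary assumptions, and a simulated mean will only approximate the theoretical value 1/(2λ).

```python
import random

def simulate_min_mean(rates, n_samples=100_000, seed=0):
    """Estimate the mean of the minimum of independent exponential times."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Draw one exponential time per "clock" and keep the smallest.
        total += min(rng.expovariate(r) for r in rates)
    return total / n_samples

lam = 1.0                                  # illustrative failure rate
estimate = simulate_min_mean([lam, lam])   # two working machines
theory = 1.0 / (lam + lam)                 # mean of an exponential with rate 2*lam

print(estimate, theory)
```

The estimate should be close to 0.5, consistent with the process leaving state 0 at rate 2λ.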


Although it is not necessary to calculate \(q_1\) to get the steady-state probabilities,
it is instructive to do so. The rate \(q_1\) is the rate at which the process leaves state 1
(one machine down and one machine operating) when it is in state 1. The process
leaves state 1 when either the operating machine goes down or the down machine is
repaired, i.e., at the minimum of the time until the operating machine fails and the
time until the down machine is repaired. The rate \(q_1\) is then given by

\[
q_1 = \lambda + \mu.
\]

Another way of viewing this result for \(q_1\) is as follows: Suppose there are two
clocks, one set to give the time for the operating machine to break down, and the
second set to give the time for the down machine to be repaired. These times have
independent exponential distributions with respective rates λ and μ. The clocks are
set and an alarm sounds as soon as either the operating machine breaks down or the
down machine is repaired. When the alarm sounds, the process leaves state 1. The
rate at which the process leaves state 1, \(q_1\), then is as shown above.
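With the rates derived above, the general expressions obtained earlier from the balance equations can be evaluated numerically. The following is a minimal sketch; `repairer_steady_state` is a hypothetical name, and the values λ = 1 and μ = 2 are chosen only for illustration.

```python
def repairer_steady_state(lam, mu):
    """Steady-state probabilities for the two-machine, one-repairer model."""
    q0, q10, q12, q2 = 2 * lam, mu, lam, mu   # rates derived in the text
    r1 = q0 / q10                 # pi_1 / pi_0
    r2 = r1 * (q12 / q2)          # pi_2 / pi_0
    pi0 = 1.0 / (1.0 + r1 + r2)
    return pi0, pi0 * r1, pi0 * r2

lam, mu = 1.0, 2.0                # illustrative rates only
pi0, pi1, pi2 = repairer_steady_state(lam, mu)
print(pi0, pi1, pi2)

# Check the middle balance equation: pi_1 q_1 = pi_0 q_01 + pi_2 q_21,
# with q_1 = lam + mu, q_01 = q_0 = 2*lam, and q_21 = q_2 = mu.
lhs = pi1 * (lam + mu)
rhs = pi0 * (2 * lam) + pi2 * mu
print(abs(lhs - rhs) < 1e-12)
```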

Using the derived values for \(q_0\), \(q_2\), \(q_{10}\), and \(q_{12}\) results in

\[
\pi_0 = \frac{1}{1 + 2(\lambda/\mu) + 2(\lambda/\mu)^2},
\]
\[
\pi_1 = \pi_0\, \frac{2\lambda}{\mu},
\]
\[
\pi_2 = \pi_0\, 2\left(\frac{\lambda}{\mu}\right)^2.
\]

The results for M machines are given in Sec. 16.6 (basic model with a limited source).


PROBLEMS


1. Assume that the probability of rain tomorrow is 0.5 if it is raining today, and assume 
that the probability of its being clear tomorrow is 0.9 if it is clear today. 

(a) Determine the one-step transition matrix of the Markov chain. 

(b) Find the two-step transition matrix. 

(c) Find the steady-state probabilities. 


2. Assume that the probability of rain tomorrow is α if it is raining today, and assume
that the probability of its being clear tomorrow is β if it is clear today.

(a) Determine the one-step transition matrix of the Markov chain. 

(b) Find the two-step transition matrix. 

(c) Find the steady-state probabilities. 


3. Consider the stock market model presented in Sec. 15.3. Whether or not the stock
goes up tomorrow depends upon whether or not it increased today and yesterday. If the stock
has increased for the past two days, it will increase tomorrow with probability \(\alpha_1\). If the stock
increased today but decreased yesterday, it will increase tomorrow with probability \(\alpha_2\). If the
stock decreased today but increased yesterday, it will increase tomorrow with probability
\(\alpha_3\). Finally, if the stock decreased for the past two days, it will increase tomorrow with
probability \(\alpha_4\).

(a) Determine the one-step transition matrix of the Markov chain. 

(b) Find the steady-state probabilities. 


4. Consider the stock market example given in Prob. 3. Suppose, however, that whether 
or not the stock goes up tomorrow depends upon whether or not it increased today, yesterday, 
and the day before yesterday. Can this problem be formulated as a Markov chain? If so, what 
are the possible states? 


5.* Determine the classes of the Markov chains and whether or not they are recurrent. 


00% 23 1000 
us 0, 30820 _1|0 2 2 0 
A=io i o ot Pe bie 4 ol 

0100 4 004 


6. Determine the classes of the Markov chains and whether or not they are recurrent. 


© 


4 4 14 

1041 0 0 1 
A=|i o 4) B522 0} 

TER J 


7. Determine the classes of the Markov chain and whether or not they are recurrent. 


COMO Oo 
Ah A OO © 
Behe OOO 


O O umawa 
O O wma aw


8. Suppose that a communications network transmits binary digits, 0 or 1. In passing
through the network, there is a probability q that the binary digit will be received incorrectly
at the next stage. If \(X_0\) denotes the binary digit entering the system, \(X_1\) the binary digit recorded
after the first transmission, \(X_2\) the binary digit recorded after the second transmission, . . . ,
then \(\{X_n\}\) is a Markov chain. Find the one-step and steady-state transition matrices.


9. A transition matrix P is said to be doubly stochastic if the sum over each column
equals 1; that is,

\[
\sum_{i=0}^{M} p_{ij} = 1, \quad \text{for all } j.
\]

If such a chain is irreducible, aperiodic, and consists of M + 1 states, show that

\[
\pi_j = \frac{1}{M + 1}, \quad \text{for } j = 0, 1, \ldots, M.
\]

10.* A particle moves on a circle through points that have been marked 0, 1, 2, 3, 4
(in a clockwise order). The particle starts at point 0. At each step it has probability q of moving
one point clockwise (0 follows 4) and 1 − q of moving one point counterclockwise. Let
\(X_n\) (n ≥ 0) denote its location on the circle. \(\{X_n\}\) is a Markov chain.

(a) Find the transition probability matrix. 

(b) Find the steady-state probabilities. 


11. The leading brewery on the West Coast (labeled A) has hired an operations research
analyst to analyze its market position. It is particularly concerned about its major competitor
(labeled B). The analyst believes that brand switching can be modeled as a Markov chain using
three states, with states A and B representing customers drinking beer produced from the
aforementioned breweries and state C representing all other brands. Data are taken monthly,
and the analyst has constructed the following transition matrix from past data.


        A       B       C
A      0.7     0.2     0.1
B      0.2     0.75    0.05
C      0.1     0.1     0.8


What are the steady-state market shares for the two major breweries? 


12. A manufacturer has a machine that, when it breaks down, requires a day to repair.
The machine breaks down with probability p. Denote the state of the system as 0 when the
machine is found to be operating at the end of the day; 1 when the machine is found to have
broken down during the day; and 2 when the machine, having broken down the previous day,
has spent the day being repaired. Show that the process forms a Markov chain and find the
steady-state probabilities.


        0       1       2
0      1−p      p       0
1       0       0       1
2       1       0       0


13. Suppose that the manufacturer in Prob. 12 keeps a spare machine that is used when
the primary machine fails. Denote the state of the system by (x, y), where x and y take on the
values 1 or 0 depending upon whether or not the machines are working at the end of the day.

(a) Show that the process forms a Markov chain and find the transition matrix.

(b) Find the steady-state probabilities.
Hint: Note that (0, 0) is not a possible state.


14. A computer is inspected at the end of every hour. It is found to be either working 
(up) or failed (down). If the computer is found to be up, the probability of it remaining up for 
the next hour is 0.90. If it is down, repair action, which may require more than an hour, is 
taken. Whenever the machine is down (regardless of how long it has been down), the probability 
of it still being down an hour later is 0.35. 

(a) Show that this is a Markov chain and find the transition matrix. 

(b) Find the steady-state probabilities of the machine being up and down. 


15. Consider the following blood inventory problem facing a hospital. Suppose there is 
need for a rare blood type, e.g., type AB, Rh negative blood. Suppose the demand over a 
three-day period is given by 


P{D = 0} = 0.4, P{D = 1} = 0.3, P{D = 2} = 0.2, and P{D = 3} = 0.1. 


Note that the expected demand is then 1 unit. Suppose that there are three days between 
deliveries. The hospital proposes a policy of receiving one pint at each delivery and uses the 
oldest blood first, i.e., it uses a FIFO policy (first in, first out). If more blood is required than 
is on hand, an expensive emergency delivery is made. Blood is discarded if it is still on the 
shelf after 21 days. Denote the state of the system as the number of pints on hand just after a 
delivery. Noting that the largest state is 7:

(a) Find the transition matrix. 

(b) Find the steady-state probabilities. 


16. A soap company specializes in a luxury type of bath soap. The sales of this soap 
fluctuate between two levels—‘‘Low’’ and ‘‘High’’—depending upon two factors: (1) whether 
they advertise or not, and (2) the advertising and marketing of new products being done by 
competitors. The second factor is out of the company’s control, but they are trying to determine 
what their own advertising policy should be. For example, the marketing manager’s proposal 
is to advertise when sales are low but not to advertise when sales are high. Advertising in any 
quarter of a year has its primary impact on sales in the following quarter. Therefore, at the 
beginning of each quarter, the needed information is available to forecast accurately whether 
sales will be low or high that quarter and to decide whether to advertise that quarter. 

The cost of advertising is $1,000,000 for each quarter of a year in which it is done. 
When advertising during a quarter, the probability of having high sales the next quarter is å or 
2, depending upon whether the current quarter’s sales are low or high. These probabilities go 
down to ł or when advertising is not done during the current quarter. The company’s quarterly 
profits (excluding advertising costs) are $4,000,000 when sales are high, but only $2,000,000 
when sales are low. (Hereafter, use units of millions of dollars.) 

(a) Assuming the company always chooses not to advertise, determine the transition 
matrix. Assuming the company always chooses to advertise, determine the transition 
matrix. Assuming the company follows the marketing manager’s proposal, determine 
the transition matrix. 

(b) Determine the steady-state probabilities for each of the three cases in part (a). 


17. Consider the camera inventory problem presented in Sec. 15.2 except that demand
now has the following probability distribution:


P{D = 0} = 1/4,
P{D = 1} = 1/2,
P{D = 2} = 1/4,
P{D ≥ 3} = 0.


The inventory policy is still an (s, S) policy with s = 1 and S = 2. Assume that there is one
camera in stock at the time the policy is instituted. Noting that there are three states:
(a) Find the transition matrix.
(b) Find the steady-state probabilities.

(c) Assuming that the store pays a storage cost for each camera remaining on the shelf
at the end of the week according to the function C(0) = 0, C(1) = $2, and
C(2) = $8, find the long-run expected average holding cost per week.


18. A production process contains a machine that deteriorates rapidly in both quality 
and output under heavy usage, so that it is inspected at the end of each day. Immediately after 
inspection, the condition of the machine is noted and classified into one of four possible states: 





State   Condition 
0       Good as new 
1       Operable, minor deterioration 
2       Operable, major deterioration 
3       Inoperable and replaced by a good-as-new machine 


The process can be modeled as a Markov chain with transition matrix given by 





State | 0 1 2 3 
0 |0 &€ wt we 
1 0 i d ¢ 
2 0 0 4 $ 
3 1 0 0 0 





(a) Find the steady-state probabilities. 
(b) If the costs of being in states 0, 1, 2, 3 are 0, $1,000, $3,000, and $6,000, respec- 
tively, what is the long-run expected average cost per day? 


19. Using ordering costs and unsatisfied demand costs, evaluate the expected average 
inventory cost per week for the inventory example introduced in Sec. 15.7, using the (s, S) 
policy, (s, S) = (2, 3). 


20.* Consider the inventory example introduced in Sec. 15.2. Instead of following an 
(s, S) policy, we use a (q, Q) policy. If the stock level at the end of each period is less than 
q = 2 units, Q = 2 additional units will be ordered. Otherwise, no ordering will take place. 
This policy is a (q, Q) policy with q = 2 and Q = 2. Let $X_t$ denote the number of units on 
hand at the end of the $t$th period. Assume that demand that is not filled results in lost sales. 
$\{X_t\}$ is a Markov chain (assume $X_0 = 0$). Using the cost values and demand distribution given 
for the inventory example in the text: 

(a) Find the steady-state probabilities. 

(b) Find the long-run expected average cost per unit time. 


21. Consider the following (k, Q) inventory policy. Let $D_1, D_2, \ldots$ be the demand 
for a product in periods 1, 2, ..., respectively. If the demand during a period exceeds the 
number of items available, this unsatisfied demand is backlogged; i.e., it is filled when the 
next order is received. Let $Z_n$ ($n = 0, 1, \ldots$) denote the amount of inventory on hand minus 
the number of units backlogged before ordering at the end of period $n$ ($Z_0 = 0$). If $Z_n$ is zero 
or positive, no orders are backlogged. If $Z_n$ is negative, then $-Z_n$ represents the number of 
backlogged units and no inventory is on hand. If at the end of period $n$, $Z_n < k = 1$, an order 
is placed for $2m$ ($Qm$ in general) units, where $m$ is the smallest integer such that $Z_n + 2m \ge 1$. 
(The amount ordered is the smallest integral multiple of 2 that brings the level to at least 
1 unit.) Let the $D_n$ be independent and identically distributed random variables taking on the values 
0, 1, 2, 3, 4, each with probability 1/5. Let $X_n$ denote the amount of stock on hand after ordering 
at the end of period $n$ ($X_0 = 2$). It is evident that 

\[
X_n = \begin{cases}
X_{n-1} - D_n + 2m, & \text{if } X_{n-1} - D_n < 1, \\
X_{n-1} - D_n, & \text{if } X_{n-1} - D_n \ge 1,
\end{cases}
\]

and $\{X_n\}$ ($n = 0, 1, \ldots$) is a Markov chain with only two states: 1 and 2. [The only time that 
ordering will take place is when $Z_n = 0, -1, -2,$ or $-3$, in which case 2, 2, 4, and 4 units 
are ordered, respectively, leaving $X_n = 2, 1, 2, 1$, respectively. In general, for any (k, Q) 
policy, the possible states are $k, k + 1, k + 2, \ldots, k + Q - 1$.] 

(a) Find the one-step transition matrix. 

(b) Find the stationary probabilities (see Prob. 9). 

(c) Suppose that the ordering cost is given by (2 + 2m) if an order is placed and zero 
otherwise. The holding cost per period is $Z_n$ if $Z_n \ge 0$ and zero otherwise. The 
shortage cost per period is $-4Z_n$ if $Z_n < 0$ and zero otherwise. Find the (long-run) 
expected average cost per unit time. 


22. An important unit consists of two components placed in parallel. The unit performs 
satisfactorily if one of the two components is operating. A component breaks down in a given 
period with probability $q$. Assume that the component breaks down only at the end of a period. 
When this occurs, the parallel component takes over, if available, beginning at the next period. 
Only one servicewoman is assigned to service each component in need of repair, and it takes 
two periods to complete the servicing. Let $X_t$ be a vector consisting of two elements $U$ and $V$. 
$U$ represents the number of components operating at the end of the $t$th period. $V$ takes on the 
value 1 if the servicewoman requires only one additional period to complete a repair, if she is 
so engaged, and zero otherwise. Thus the state space consists of the four states (2, 0), (1, 0), 
(0, 1), and (1, 1). For example, the state (1, 1) implies that one component is operative and 
the other component needs an additional period for repair before becoming operative. Denote 
these four states by 0, 1, 2, 3, respectively. $\{X_t\}$ ($t = 0, 1, \ldots$) is a Markov chain [assume 
that $X_0$ is the vector (2, 0); that is, $X_0 = 0$] with transition matrix 

\[
P = \begin{bmatrix}
1-q & q & 0 & 0 \\
0 & 0 & q & 1-q \\
0 & 1 & 0 & 0 \\
1-q & q & 0 & 0
\end{bmatrix}.
\]

(a) What is the probability that there is a waiting line of length 1 (a component needing 
service but not being worked on) at the end of a current period? 

(b) What are the steady-state probabilities? 

(c) If it costs $30,000 per period when the unit is inoperable (both components down) 
and zero otherwise, what is the (long-run) expected average cost per period? 


23. Consider the following gambler’s ruin problem. A gambler bets one unit on each 
play of a game. She has a probability p of winning and q = 1 — p of losing. She will continue 
to play until she goes broke or nets a fortune of $T$ units. Let $X_n$ denote the gambler’s fortune 
on the $n$th play of the game. Then 

\[
X_{n+1} = \begin{cases}
X_n + 1, & \text{with probability } p, \\
X_n - 1, & \text{with probability } q = 1 - p,
\end{cases}
\qquad \text{for } 0 < X_n < T,
\]
\[
X_{n+1} = X_n \qquad \text{for } X_n = 0 \text{ or } T.
\]

$\{X_n\}$ is a Markov chain. Assume that successive plays of the game are independent and that the 
gambler has an initial fortune of $X_0$. 

(a) Determine the one-step transition matrix of the Markov chain. 

(b) Find the classes of the Markov chain. 

(c) Let T = 3 and p = 0.3. Find $f_{10}$, $f_{1T}$, $f_{20}$, $f_{2T}$. 

(d) Let T = 3 and p = 0.7. Find $f_{10}$, $f_{1T}$, $f_{20}$, $f_{2T}$. 
What can you conclude from (c) and (d)? 


24. A video recorder manufacturer is so certain of its quality control that it is offering 
a complete replacement warranty if the set fails within two years. Based upon compiled data, 
the company has noted that only 1 percent of its recorders fail during the first year and 5 percent 
fail during the second year. The warranty does not cover replaced recorders. 

(a) Formulate this problem as a Markov chain and determine the transition matrix. 

(b) Find the probability that the manufacturer will have to honor the warranty. 


25. Consider the repairer problem presented in Sec. 15.9. Derive the expressions for 
the steady-state probabilities when there are M machines in the system. 


26. Consider a single-server queueing system, in which customers arrive according to 
a Poisson input process with parameter $\lambda$ (see Sec. 16.6), and the service times for the respective 
calling units are independent and identically distributed random variables. For $n = 1, 2, \ldots$, 
let $X_n$ denote the number of calling units in the system at the moment $t_n$ when the 
$n$th calling unit to be served (over a certain time interval) has finished being served. The 
times $\{t_n\}$ corresponding to the moments when successive calling units depart service 
are called regeneration points. Furthermore, $\{X_n\}$, which represents the number of calling units 
in the system at the corresponding sequence of times $\{t_n\}$, is a Markov chain and is known as 
an imbedded Markov chain. Imbedded Markov chains are useful for studying the properties of 
continuous time parameter stochastic processes. 

Now consider the particular special case where the service time of successive calling 
units is a fixed constant, say, 10 minutes, and the mean arrival rate is one every 50 minutes. 
To obtain a finite number of states, assume as an approximation that, if there are four calling 
units in the system, the system becomes saturated so that additional arrivals are turned away. 
Therefore, $\{X_n\}$ is an imbedded Markov chain with state 0, 1, 2, or 3. (Because there are never 
more than four calling units in the system, there can never be more than three in the system at 
a regeneration point.) Because the system is observed at successive departures, $X_n$ can never 
decrease by more than 1. Furthermore, the probabilities of transitions that result in increases 
in $X_n$ are obtained directly from the Poisson distribution. 

(a) Find the one-step transition matrix. (In obtaining the transition probability from state 
3 to state 3, use the probability of one or more arrivals rather than just one arrival, 
and similarly for other transitions to state 3.) 

(b) Find the steady-state probabilities for the number of calling units in the system at 
regeneration points. 

(c) Compute the expected number of calling units in the queueing system at regeneration 
points, and compare it to the value of L for the single-server model in Sec. 16.7. 


16 


Queueing Theory 


Queueing theory involves the mathematical study of queues, or waiting lines. The 
formation of waiting lines is, of course, a common phenomenon that occurs whenever 
the current demand for a service exceeds the current capacity to provide that service. 
Decisions regarding the amount of capacity to provide must be made frequently in 
industry and elsewhere. However, because it is often impossible to predict accurately 
when units will arrive to seek service and/or how much time will be required to 
provide that service, these decisions often are difficult ones. Providing too much 
service involves excessive costs. On the other hand, not providing enough service 
capacity causes the waiting line to become excessively long at times. Excessive wait- 
ing also is costly in some sense, whether it be a social cost, the cost of lost customers, 
the cost of idle employees, or some other important cost. Therefore, the ultimate goal 
is to achieve an economic balance between the cost of service and the cost associated 
with waiting for that service. Queueing theory itself does not solve this problem 
directly; however, it does contribute vital information required for such decisions by 
predicting such various characteristics of the waiting line as the average waiting time. 

Queueing theory provides a large number of alternative mathematical models 
for describing a waiting-line situation. Mathematical results that predict some of the 
characteristics of the waiting line often are available for these models. After some 
general discussion, this chapter presents most of the more elementary models and 
their basic results. Chapter 17 discusses how the information provided by queueing 
theory might be used for making decisions. 


16.1 Prototype Example 


The emergency room of COUNTY HOSPITAL provides quick medical care for emer- 
gency cases brought to the hospital by ambulance or private automobile. At any hour 
there is always one doctor on duty in the emergency room. However, because of a 
growing tendency for emergency cases to use these facilities rather than go to a private 
physician, the hospital has been experiencing a continuing increase in the number of 
emergency room visits each year. As a result, it has become quite common for patients 
arriving during peak usage hours (the early evening) to have to wait until it is their 
turn to be treated by the doctor. Therefore, a proposal has been made that a second 
doctor should be assigned to the emergency room during these hours, so that two 
emergency cases can be treated simultaneously. The hospital’s Management Engineer 
has been assigned to study this question.¹ 

The Management Engineer began by gathering the relevant historical data and 
then projecting these data into the next year. Recognizing that the emergency room 
is a queueing system, he applied several alternative queueing theory models to predict 
the waiting characteristics of the system with one doctor and with two doctors—as 
you will see in the latter sections of this chapter (see Tables 16.2, 16.3, and 16.4). 


16.2 Basic Structure of Queueing Models 


The Basic Queueing Process 


The basic process assumed by most queueing models is the following. Customers 
requiring service are generated over time by an input source. These customers enter 
the queueing system and join a queue. At certain times a member of the queue is 
selected for service by some rule known as the queue discipline (or service discipline). 
The required service is then performed for the customer by the service mechanism, 
after which the customer leaves the queueing system. This process is depicted in 
Fig. 16.1. 

Many alternative assumptions can be made about the various elements of the 
queueing process; they are discussed next. 


Input Source (Calling Population) 


One characteristic of the input source is its size. The size is the total number of 
customers that might require service from time to time, i.e., the total number of 
distinct potential customers. This population from which arrivals come is referred to 
as the calling population. The size may be assumed to be either infinite or finite (so 
that the input source also is said to be either unlimited or limited). Because the 
calculations are far easier for the infinite case, this assumption often is made even 
when the actual size is some relatively large finite number, and it should be taken to 
be the implicit assumption for any queueing model that does not state otherwise. The 
finite case is more difficult analytically because the number of customers in the 
queueing system affects the number of potential customers outside the system at any 
time. However, the finite assumption must be made if the rate at which the input 
source generates new customers is significantly affected by the number of customers 
in the queueing system. 

Figure 16.1 The basic queueing process. 

1 For one actual case study of this kind, see Bolling, W. Blaker: “Queueing Model of a Hospital Emergency 
Room,” Industrial Engineering, pp. 26-31, September 1972. 

The statistical pattern by which customers are generated over time must also be 
specified. The common assumption is that they are generated according to a Poisson 
process; i.e., the number of customers generated until any specific time has a Poisson 
distribution. As we discuss in Sec. 16.4, this case is the one where arrivals to the 
queueing system occur randomly but at a certain fixed mean rate, regardless of how 
many customers already are there (so the size of the input source is infinite). An 
equivalent assumption is that the probability distribution of the time between consec- 
utive arrivals is an exponential distribution. (The properties of this distribution are 
described in Sec. 16.4.) The time between consecutive arrivals is referred to as the 
interarrival time. 
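
As a rough illustration of the equivalence just described, the following Python sketch builds arrival times by accumulating exponential interarrival times and then compares the observed arrival rate with the assumed mean rate. The function name `arrival_times`, the rate of 2 customers per unit time, the horizon of 1,000 time units, and the seed are illustrative assumptions, not values from the text.

```python
import random


def arrival_times(rate, horizon, seed=0):
    """Return arrival epochs in [0, horizon) generated from
    independent exponential interarrival times with mean 1/rate."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    t = 0.0
    arrivals = []
    while True:
        t += rng.expovariate(rate)  # next interarrival time
        if t >= horizon:
            return arrivals
        arrivals.append(t)


# Hypothetical parameters: mean rate of 2 arrivals per unit time.
arrivals = arrival_times(rate=2.0, horizon=1000.0)
observed_rate = len(arrivals) / 1000.0
print(f"observed arrival rate: {observed_rate:.3f} per unit time")
```

Over a long horizon the observed rate should settle near the assumed mean rate, regardless of how many customers have already arrived.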

Any unusual assumptions about the behavior of the customers must also be 
specified. One example is balking, where the customer refuses to enter the system 
and is lost if the queue is too long. 


Queue 


A queue is characterized by the maximum permissible number of customers that it 
can contain. Queues are called infinite or finite, according to whether this number is 
infinite or finite. The assumption of an infinite queue is the standard one for most 
queueing models, even for situations where there actually is a (relatively large) finite 
upper bound on the permissible number of customers, because dealing with such an 
upper bound would be a complicating factor in the analysis. However, for queueing 
systems where this upper bound is small enough that it actually would be reached 
with some frequency, it becomes necessary to assume a finite queue. 


Queue Discipline 


The queue discipline refers to the order in which members of the queue are selected 
for service. For example, it may be first-come-first-served, random, according to some 
priority procedure, or some other order. First-come-first-served usually is assumed by 
queueing models unless stated otherwise. 


Service Mechanism 


The service mechanism consists of one or more service facilities, each of which 
contains one or more parallel service channels, called servers. If there is more than 
one service facility, the customer may receive service from a sequence of these (service 
channels in series). At a given facility, the customer enters one of the parallel service 
channels and is completely serviced by that server. A queueing model must specify 
the arrangement of the facilities and the number of servers (parallel channels) at each 
one. Most elementary models assume one service facility with either one or a finite 
number of servers. 

The time elapsed from the commencement of service to its completion for a 
customer at a service facility is referred to as the service time (or holding time). A 
model of a particular queueing system must specify the probability distribution of 
service times for each server (and possibly for different types of customers), although 
it is common to assume the same distribution for all servers (all models in this chapter 
make this assumption). The service-time distribution that is most frequently assumed 
in practice (largely because it is far more tractable than any other) is the exponential 
distribution discussed in Sec. 16.4, and most of our models will be of this type. Other 
important service-time distributions are the degenerate distribution (constant service 
time) and the Erlang (gamma) distribution, as illustrated by models in Sec. 16.7. 


An Elementary Queueing Process 


As we have already suggested, queueing theory has been applied to many different 
types of waiting-line situations. However, the most prevalent type of situation is the 
following: A single waiting line (which may be empty at times) forms in the front of 
a single service facility, within which are stationed one or more servers. Each customer 
generated by an input source is serviced by one of the servers, perhaps after some 
waiting in the queue (waiting line). The queueing system involved is depicted in 
Fig. 16.2. 

Notice that the queueing process in the illustrative example of Sec. 16.1 is of 
this type. The input source generates customers in the form of emergency cases 
requiring medical care. The emergency room is the service facility, and the doctors 
are the servers. 
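
The elementary process described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a single waiting line, a first-come-first-served discipline, and arrival and service times supplied as given data. The function name `fcfs_waits` and all of the numbers are hypothetical and are not taken from the County Hospital study.

```python
import heapq


def fcfs_waits(arrivals, services, servers=1):
    """Return each customer's wait in the queue (excluding service)
    for a single first-come-first-served line feeding `servers`
    identical parallel servers. `arrivals` must be in increasing order."""
    free_at = [0.0] * servers  # time at which each server next becomes free
    heapq.heapify(free_at)
    waits = []
    for arrive, service in zip(arrivals, services):
        earliest = heapq.heappop(free_at)   # first server to become available
        start = max(arrive, earliest)       # service cannot start before arrival
        waits.append(start - arrive)
        heapq.heappush(free_at, start + service)
    return waits


# Hypothetical data: four emergency cases (times in hours).
arrivals = [0.0, 1.0, 2.0, 6.0]
services = [3.0, 3.0, 1.0, 1.0]
one_doctor = fcfs_waits(arrivals, services, servers=1)
two_doctors = fcfs_waits(arrivals, services, servers=2)
print(one_doctor, two_doctors)
```

With these made-up data, adding a second server removes most of the waiting, which is the kind of comparison the prototype example calls for.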

A server need not be a single individual; it may be a group of persons, e.g., a 
repair crew that combines forces to perform simultaneously the required service for a 
customer. Furthermore, servers need not even be people: In many cases, a server may 
be a machine or a piece of equipment, e.g., a forklift that performs a given service 
on call (although probably with human guidance). By the same token, the customers 
in the waiting line need not be people. For example, they may be items waiting for 
a certain operation by a given type of machine, or they may be cars waiting in front 
of a toll booth. 

Figure 16.2 An elementary queueing system (each customer is indicated by a C and each server by 
an S). 


It is not necessary that there actually be a physical waiting line forming in front 
of a physical structure that constitutes the service facility; that is, the members of the 
queue may be scattered throughout an area waiting for a server to come to them, e.g., 
machines waiting to be repaired. The server or group of servers assigned to a given 
area constitute the service facility for that area. Queueing theory still gives the average 
number waiting, the average waiting time, and so on, because it is irrelevant whether 
or not the customers wait together in a group. The only essential requirement for 
queueing theory to be applicable is that changes in the number of customers waiting 
for a given service occur just as though the physical situation described in Fig. 16.2 
(or a legitimate counterpart) prevails. 

Except for Sec. 16.9, all the queueing models discussed in this chapter are of 
the elementary type depicted in Fig. 16.2. Many of these models further assume that 
all interarrival times are independent and identically distributed and that all service 
times are independent and identically distributed. Such models conventionally are 
labeled as follows: 

    (distribution of interarrival times) / (distribution of service times) / (number of servers) 

where M = exponential distribution (Markovian), as described in Sec. 16.4, 
      D = degenerate distribution (constant times), as discussed in Sec. 16.7, 
      $E_k$ = Erlang distribution (shape parameter = k), as described in Sec. 16.7, 
      G = general distribution (any arbitrary distribution allowed),¹ as discussed 
          in Sec. 16.7. 


For example, the M/M/s model discussed in Sec. 16.6 assumes that both interarrival 
times and service times have an exponential distribution and that the number of servers 
is s (any positive integer). The M/G/1 model discussed again in Sec. 16.7 assumes 
that interarrival times have an exponential distribution, but it places no restriction on 
what the distribution of service times must be, whereas the number of servers is 
restricted to be exactly one. Various other models that fit this labeling scheme also 
are introduced in Sec. 16.7. 

1 When referring to interarrival times, it is conventional to replace the symbol G by GI = general 
independent distribution. 
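
To make the three-field labeling scheme concrete, here is a small Python sketch that splits a label such as M/G/1 into its parts and translates the distribution symbols into words. The names `QueueLabel`, `describe`, and `parse_label` are hypothetical, and writing the Erlang symbol as `E2` (for shape parameter 2) is an assumption made only so the label fits in plain text.

```python
from dataclasses import dataclass

# Distribution symbols used by the labeling scheme in the text.
SYMBOLS = {
    "M": "exponential",
    "D": "degenerate (constant times)",
    "G": "general",
    "GI": "general independent",
}


@dataclass(frozen=True)
class QueueLabel:
    interarrival: str  # distribution of interarrival times
    service: str       # distribution of service times
    servers: str       # number of servers, kept as text so "s" is allowed


def describe(symbol):
    """Translate one distribution symbol into words."""
    if symbol in SYMBOLS:
        return SYMBOLS[symbol]
    if symbol.startswith("E") and len(symbol) > 1:
        return f"Erlang (shape parameter {symbol[1:]})"
    raise ValueError(f"unknown distribution symbol: {symbol!r}")


def parse_label(label):
    """Split 'interarrival/service/servers' into a QueueLabel."""
    parts = label.split("/")
    if len(parts) != 3:
        raise ValueError("expected three fields separated by '/'")
    interarrival, service, servers = parts
    return QueueLabel(describe(interarrival), describe(service), servers)


mg1 = parse_label("M/G/1")
print(mg1)
```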


Terminology and Notation 

Unless otherwise noted, the following standard terminology and notation will be used: 

State of the system = number of customers in queueing system. 

Queue length = number of customers waiting for service 
             = state of system minus number of customers being served. 

$N(t)$ = number of customers in queueing system at time $t$ ($t \ge 0$). 

$P_n(t)$ = probability that exactly $n$ customers are in queueing system at time $t$, 
given number at time 0. 

$s$ = number of servers (parallel service channels) in queueing system. 

$\lambda_n$ = mean arrival rate (expected number of arrivals per unit time) of new 
customers when $n$ customers are in system. 

$\mu_n$ = mean service rate for overall system (expected number of customers 
completing service per unit time) when $n$ customers are in system. Note: $\mu_n$ 
represents combined rate at which all busy servers (those serving customers) 
achieve service completions. 

$\lambda$, $\mu$, $\rho$: see following paragraph. 


When $\lambda_n$ is a constant for all $n$, this constant is denoted by $\lambda$. When the mean 
service rate per busy server is a constant for all $n \ge 1$, this constant is denoted by 
$\mu$. (In this case, $\mu_n = s\mu$ when $n \ge s$, that is, when all $s$ servers are busy.) Under 
these circumstances, $1/\lambda$ and $1/\mu$ are the expected interarrival time and the expected 
service time, respectively. Also, $\rho = \lambda/(s\mu)$ is the utilization factor for the service 
facility, i.e., the expected fraction of time the individual servers are busy, because 
$\lambda/(s\mu)$ represents the fraction of the system’s service capacity ($s\mu$) that is being utilized 
on the average by arriving customers ($\lambda$). 
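
The utilization factor is simple enough to compute directly: it is the mean arrival rate divided by the product of the number of servers and the mean service rate per busy server. The Python sketch below does this for made-up numbers; the function name `utilization` and the rates (1 arrival per hour, 2 service completions per hour per server) are assumptions for illustration only.

```python
def utilization(arrival_rate, service_rate, servers=1):
    """Utilization factor: arrival rate / (servers * service rate per server)."""
    if arrival_rate < 0 or service_rate <= 0 or servers < 1:
        raise ValueError("rates must be positive and servers must be at least 1")
    return arrival_rate / (servers * service_rate)


# Hypothetical rates, per hour.
rho_one = utilization(arrival_rate=1.0, service_rate=2.0, servers=1)
rho_two = utilization(arrival_rate=1.0, service_rate=2.0, servers=2)
print(rho_one, rho_two)  # 0.5 0.25
```

Doubling the number of servers halves the fraction of capacity in use, all else being equal.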

Certain notation also is required to describe steady-state results. When a 
queueing system has recently begun operation, the state of the system (number of 
customers in the system) will be greatly affected by the initial state and the time that 
has since elapsed. The system is now said to be in a transient condition. However, 
after sufficient time has elapsed, the state of the system becomes essentially independent 
of the initial state and the elapsed time (except under unusual circumstances).¹ 
The system has now essentially reached a steady-state condition, where the probability 
distribution of the state of the system remains the same (the steady-state or 
stationary distribution) over time. Queueing theory has tended to focus largely on the 
steady-state condition, partially because the transient case is more difficult analytically. 
(Some transient results exist, but they are generally beyond the technical scope 
of this book.) The following notation assumes that the system is in a steady-state 
condition: 

$P_n$ = probability that exactly $n$ customers are in queueing system. 

$L$ = expected number of customers in queueing system. 

$L_q$ = expected queue length (excludes customers being served). 

$\mathcal{W}$ = waiting time in system (includes service time) for each individual 
customer. 

$W = E(\mathcal{W})$. 

$\mathcal{W}_q$ = waiting time in queue (excludes service time) for each individual 
customer. 

$W_q = E(\mathcal{W}_q)$. 

1 When $\lambda$ and $\mu$ are defined, these unusual circumstances are that $\rho \ge 1$, in which case the state of the 
system tends to grow continually larger as time goes on. 


Relationships between $L$, $W$, $L_q$, and $W_q$ 


Assume that $\lambda_n$ is a constant $\lambda$ for all $n$. It has been proven that in a steady-state 
queueing process, 

\[ L = \lambda W. \]

(Because John D. C. Little¹ provided the first rigorous proof, this equation sometimes 
is referred to as Little’s formula.) Furthermore, the same proof also shows that 

\[ L_q = \lambda W_q. \]

If the $\lambda_n$ are not equal, then $\lambda$ can be replaced in these equations by $\bar{\lambda}$, the 
average arrival rate over the long run. (We shall show later how $\bar{\lambda}$ can be determined 
for some basic cases.) 

Now assume that the mean service time is a constant, $1/\mu$, for all $n \ge 1$. It 
then follows that 

\[ W = W_q + \frac{1}{\mu}. \]

These relationships are extremely important because they enable all four of the 
fundamental quantities ($L$, $W$, $L_q$, and $W_q$) to be immediately determined as soon as 
one of them is found analytically. This situation is fortunate because some of these 
quantities often are much easier to find than others when solving a queueing model 
from basic principles. 
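
As a small worked sketch of how one quantity determines the other three, the Python function below starts from an expected queue length and applies the three relationships above in turn (expected queue length equals arrival rate times expected wait in queue; expected wait in system equals expected wait in queue plus the mean service time; expected number in system equals arrival rate times expected wait in system). The function name `measures_from_Lq` and the input values are hypothetical, and constant mean arrival and service rates are assumed.

```python
def measures_from_Lq(Lq, arrival_rate, service_rate):
    """Derive W_q, W, and L from L_q under constant arrival and service rates."""
    Wq = Lq / arrival_rate          # from L_q = lambda * W_q
    W = Wq + 1.0 / service_rate     # from W = W_q + 1/mu
    L = arrival_rate * W            # from L = lambda * W
    return {"L": L, "Lq": Lq, "W": W, "Wq": Wq}


# Hypothetical inputs: L_q = 0.5 customers, lambda = 1 per hour, mu = 2 per hour.
m = measures_from_Lq(Lq=0.5, arrival_rate=1.0, service_rate=2.0)
print(m)  # {'L': 1.0, 'Lq': 0.5, 'W': 1.0, 'Wq': 0.5}
```

The same chain can be started from any of the four quantities by rearranging the three equations.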


16.3 Examples of Real Queueing Systems 


It may appear that our description of queueing systems in the preceding section is 
relatively abstract and applicable to only rather special practical situations. On the 
contrary, queueing systems are surprisingly prevalent in a wide variety of contexts. 
To broaden your horizons on the applicability of queueing theory, we shall briefly 
mention various examples of real queueing systems. 

One important class of queueing systems that we all encounter in our daily lives 
is commercial service systems, where outside customers receive service from commercial 
organizations. Many of these involve person-to-person service at a fixed location, 
such as a barber shop (the barbers are the servers), bank teller service, checkout 
stands at a grocery store, and a cafeteria line (service channels in series). However, 
many others do not, such as home appliance repairs (the server travels to the customers), 
a vending machine (the server is a machine), and a gas station (the cars are 
the customers). 

1 Little, John D. C.: “A Proof for the Queueing Formula: $L = \lambda W$,” Operations Research, 9(3):383-387, 
1961; also see Stidham, Shaler, Jr.: “A Last Word on $L = \lambda W$,” Operations Research, 22(2):417-421, 
1974. 

Another important class is transportation service systems. For some of these 
systems the vehicles are the customers, such as cars waiting at a toll booth or traffic 
light (the server), a truck or ship waiting to be loaded or unloaded by a crew (the 
server), and airplanes waiting to land or take off from a runway (the server). (An 
unusual example of this kind is a parking lot, where the cars are the customers and 
the parking spaces are the servers, but there is no queue because arriving customers 
go elsewhere to park if the lot is full.) In other cases, the vehicles, such as taxicabs, 
fire trucks, and elevators, are the servers. 

In recent years, queueing theory probably has been applied most to business-industrial 
internal service systems, where the customers receiving service are internal 
to the organization. Examples include materials-handling systems, where materials-handling 
units (the servers) move loads (the customers); maintenance systems, 
where maintenance crews (the servers) repair machines (the customers); and inspection 
stations, where quality control inspectors (the servers) inspect items (the customers). 
Employee facilities and typing pools also fit into this category. In addition, machines 
can be viewed as servers whose customers are the jobs being processed. A related 
example of great importance is a computer facility, where the computer is viewed as 
the server. 

There is now growing recognition that queueing theory also is applicable to 
social service systems. For example, a judicial system is a queueing network, where 
the courts are service facilities, the judges (or panels of judges) are the servers, and 
the cases waiting to be tried are the customers. A legislative system is a similar 
queueing network, where the customers are the bills waiting to be processed. Various 
health-care systems also are queueing systems. You already have seen one example 
in Sec. 16.1 (a hospital emergency room), but you can also view ambulances, x-ray 
machines, and hospital beds as servers in their own queueing systems. Similarly, 
families waiting for low- and moderate-income housing, or other social services, can 
be viewed as customers in a queueing system. 

Although these are four broad classes of queueing systems, they still do not 
exhaust the list. In fact, queueing theory first began early in this century with appli- 
cations to telephone engineering (the founder of queueing theory, A. K. Erlang, was 
an employee of the Danish Telephone Company in Copenhagen), and telephone engineering 
still is an important application. Furthermore, we all have our own personal 
queues—homework assignments, books to be read, and so forth. However, these 
examples are sufficient to suggest that queueing systems do indeed pervade many 
areas of society. 


16.4 The Role of the Exponential Distribution 


The operating characteristics of queueing systems are determined largely by two statistical 
properties, namely, the probability distribution of interarrival times (see “Input 
Source” in Sec. 16.2) and the probability distribution of service times (see “Service 
Mechanism” in Sec. 16.2). For real queueing systems, these distributions can take 
on almost any form. (The only restriction is that negative values cannot occur.) However, 
to formulate a queueing theory model as a representation of the real system, it 
is necessary to specify the assumed form of each of these distributions. To be useful, 
the assumed form should be sufficiently realistic, so that the model provides reasonable 
predictions while, at the same time, being sufficiently simple, so that the model 
is mathematically tractable. Based on these considerations, the most important probability 
distribution in queueing theory is the exponential distribution. 

Suppose that a random variable $T$ represents either interarrival or service times. 
(We shall refer to the occurrences marking the end of these times, arrivals or service 
completions, as events.) $T$ is said to have an exponential distribution with parameter 
$\alpha$ if its probability density function is 

\[
f_T(t) = \begin{cases}
\alpha e^{-\alpha t}, & \text{for } t \ge 0, \\
0, & \text{for } t < 0,
\end{cases}
\]

as shown in Fig. 16.3. In this case, the cumulative probabilities are 

\[
P\{T \le t\} = 1 - e^{-\alpha t}, \qquad P\{T > t\} = e^{-\alpha t} \qquad (t \ge 0),
\]

and the expected value and variance of $T$ are 

\[
E(T) = \frac{1}{\alpha}, \qquad \mathrm{var}(T) = \frac{1}{\alpha^2}.
\]

What are the implications of assuming that $T$ has an exponential distribution for 
a queueing model? To explore this question, let us examine six key properties of the 
exponential distribution. 


Property 1: f (t) is a strictly decreasing function of t (t = 0). 


Figure 16.3 Probability density function for the exponential distribution.

One consequence of Property 1 is that

$$
P\{0 \le T \le \Delta t\} > P\{t \le T \le t + \Delta t\}
$$

for any strictly positive values of $\Delta t$ and t. [This consequence follows from the fact
that these probabilities are the area under the $f_T(t)$ curve over the indicated interval
of length $\Delta t$, and the average height of the curve is less for the second probability
than for the first.] Therefore, it is not only possible but relatively likely that T will
take on a small value near zero. In fact,


$$
P\left\{ 0 \le T \le \frac{1}{2}\,\frac{1}{\alpha} \right\} = 0.393,
$$

whereas

$$
P\left\{ \frac{1}{2}\,\frac{1}{\alpha} \le T \le \frac{3}{2}\,\frac{1}{\alpha} \right\} = 0.383,
$$

so that the value T takes on is more likely to be ‘‘small’’ [i.e., less than half of E(T)]
than ‘‘near’’ its expected value [i.e., no further away than half of E(T)], even though
the second interval is twice as wide as the first.
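
The two probabilities quoted above can be checked numerically from the cumulative probabilities of the exponential distribution. The following is a minimal sketch; the helper names and the rate value are illustrative assumptions (the result does not depend on the rate chosen).

```python
import math

def exp_cdf(t: float, alpha: float) -> float:
    """P{T <= t} for an exponential distribution with parameter alpha."""
    return 1.0 - math.exp(-alpha * t) if t >= 0 else 0.0

def interval_prob(lo: float, hi: float, alpha: float) -> float:
    """P{lo <= T <= hi}, obtained as a difference of cumulative probabilities."""
    return exp_cdf(hi, alpha) - exp_cdf(lo, alpha)

alpha = 2.0              # arbitrary illustrative rate, so E(T) = 1/alpha
mean = 1.0 / alpha

# "Small": less than half of E(T).  "Near": within half of E(T) of the mean.
p_small = interval_prob(0.0, 0.5 * mean, alpha)
p_near = interval_prob(0.5 * mean, 1.5 * mean, alpha)

print(round(p_small, 3), round(p_near, 3))
```

Changing `alpha` rescales both intervals together, so the two printed values stay the same.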

Is this really a reasonable property for T in a queueing model? If T represents 
service times, the answer depends upon the general nature of the service involved, as 
discussed next. 

If the service required is essentially identical for each customer, with the server 
always performing the same sequence of service operations, then the actual service 
times tend to be near the expected service time. Small deviations from the mean may 
occur, but usually because of only minor variations in the efficiency of the server. A 
small service time far below the mean is essentially impossible, because a certain 
minimum amount of time is needed to perform the required service operations even 
when the server is working at top speed. The exponential distribution clearly does not 
provide a close approximation to the service-time distribution for this type of situation. 

On the other hand, consider the type of situation where the specific tasks required 
of the server differ among the customers. The broad nature of the service may be the 
same, but the specific type and amount of service differ. For example, this would be 
the case in the County Hospital emergency room problem discussed in Sec. 16.1. The 
doctors encounter a wide variety of medical problems. In most cases, they can provide 
the required treatment rather quickly, but an occasional patient requires extensive 
care. Similarly, bank tellers and grocery store checkout clerks are other servers of 
this general type, where the required service is often brief but must occasionally be 
extensive. An exponential service-time distribution would seem quite plausible for 
this type of service situation. 

If T represents interarrival times, Property 1 rules out situations where potential 
customers approaching the queueing system tend to postpone their entry if they see 
another customer entering ahead of them. On the other hand, it is entirely consistent 
with the common phenomenon of arrivals occurring ‘‘randomly,’’ described by sub- 
sequent properties. 


Property 2: Lack of memory. 


This property can be stated mathematically as

$$
P\{T > t + \Delta t \mid T > \Delta t\} = P\{T > t\}
$$

for any positive quantities t and $\Delta t$. In other words, the probability distribution of the
remaining time until the event (arrival or service completion) occurs always is the
same, regardless of how much time ($\Delta t$) already has passed. In effect, the process
‘‘forgets’’ its history. This surprising phenomenon occurs with the exponential distribution
because

$$
\begin{aligned}
P\{T > t + \Delta t \mid T > \Delta t\}
&= \frac{P\{T > \Delta t,\ T > t + \Delta t\}}{P\{T > \Delta t\}} \\
&= \frac{P\{T > t + \Delta t\}}{P\{T > \Delta t\}} \\
&= \frac{e^{-\alpha (t + \Delta t)}}{e^{-\alpha \Delta t}} \\
&= e^{-\alpha t} = P\{T > t\}.
\end{aligned}
$$

For interarrival times, this property describes the common situation where the
time until the next arrival is completely uninfluenced by when the last arrival occurred. 
For service times, the property is more difficult to interpret. We should not expect it 
to hold in a situation where the server must perform the same fixed sequence of 
operations for each customer, because then a long elapsed service should imply that 
probably little remains to be done. On the other hand, in the type of situation where 
the required service operations differ among the customers, the mathematical statement 
of the property may be quite realistic. For this case, if considerable service has already 
elapsed for a customer, the only implication may be that this particular customer 
requires more extensive service than most. 
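
The lack-of-memory property can be illustrated numerically by evaluating the conditional probability directly from its definition. This is a small sketch; the function names and the parameter values are assumptions chosen only for illustration.

```python
import math

def survival(t: float, alpha: float) -> float:
    """P{T > t} for an exponential distribution with parameter alpha."""
    return math.exp(-alpha * t)

def conditional_survival(t: float, dt: float, alpha: float) -> float:
    """P{T > t + dt | T > dt}, computed from the definition of conditional probability."""
    return survival(t + dt, alpha) / survival(dt, alpha)

alpha = 0.5
for dt in (0.1, 1.0, 10.0):
    # The conditional value should not depend on the time dt already elapsed.
    print(dt, conditional_survival(2.0, dt, alpha), survival(2.0, alpha))
```

Each printed row shows the same two values (up to rounding error), whatever the elapsed time `dt`.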


Property 3: The minimum of several independent exponential random variables 
has an exponential distribution. 


To state this property mathematically, let $T_1, T_2, \ldots, T_n$ be independent
exponential random variables with parameters $\alpha_1, \alpha_2, \ldots, \alpha_n$, respectively. Also let
U be the random variable that takes on the value equal to the minimum of the values
actually taken on by $T_1, T_2, \ldots, T_n$; that is,

$$
U = \min\{T_1, T_2, \ldots, T_n\}.
$$

Thus, if $T_i$ represents the time until a particular kind of event will occur, then U
represents the time until the first of the n different events will occur. Now note that
for any t ≥ 0,

$$
\begin{aligned}
P\{U > t\} &= P\{T_1 > t,\ T_2 > t,\ \ldots,\ T_n > t\} \\
&= P\{T_1 > t\}\, P\{T_2 > t\} \cdots P\{T_n > t\} \\
&= e^{-\alpha_1 t} e^{-\alpha_2 t} \cdots e^{-\alpha_n t} \\
&= \exp\left( -\sum_{i=1}^{n} \alpha_i t \right),
\end{aligned}
$$

so that U indeed has an exponential distribution with parameter

$$
\alpha = \sum_{i=1}^{n} \alpha_i.
$$

This property has some implications for interarrival times in queueing models. 
In particular, suppose that there are several (n) different types of customers, but the
interarrival times for each type (type i) have an exponential distribution with parameter
$\alpha_i$ (i = 1, 2, ..., n). By Property 2, the remaining time from any specified instant
until the next arrival of a customer of type i would have this same distribution.
Therefore, let $T_i$ be this remaining time measured from the instant a customer of any
type arrives. Property 3 then tells us that U, the interarrival time for the queueing
system as a whole, has an exponential distribution with parameter $\alpha$ defined by the
last equation. As a result, you can choose to ignore the distinction between customers 
and still have exponential interarrival times for the queueing model. 

However, the implications are even more important for service times in queueing 
models having more than one server than they are for interarrival times. For example, 
consider the situation where all the servers have the same exponential service-time 
distribution with parameter $\mu$. For this case, let n be the number of servers currently
providing service, and let $T_i$ be the remaining service time for server i
(i = 1, 2, ..., n), which also has an exponential distribution with parameter $\alpha_i = \mu$. It
then follows that U, the time until the next service completion from any of these
servers, has an exponential distribution with parameter $\alpha = n\mu$. In effect, the
queueing system currently would be performing just like a single-server system, where
service times have an exponential distribution with parameter $n\mu$. We shall make
frequent use of this implication for analyzing multiple-server models later in the 
chapter. 
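
Property 3 can also be checked empirically. The sketch below simulates the minimum of several independent exponential random variables and compares its sample mean and one tail probability with the values implied by an exponential distribution whose parameter is the sum of the individual parameters. The rates, sample size, seed, and function name are illustrative assumptions, and a simulation only suggests (it does not prove) the result.

```python
import math
import random

def simulate_minimum(alphas, n_samples=100_000, seed=0):
    """Draw n_samples values of U = min(T_1, ..., T_n), T_i exponential with rate alphas[i]."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [min(rng.expovariate(a) for a in alphas) for _ in range(n_samples)]

alphas = [1.0, 2.0, 3.0]       # illustrative rates, not taken from the text
samples = simulate_minimum(alphas)

sample_mean = sum(samples) / len(samples)
theoretical_mean = 1.0 / sum(alphas)            # mean of exponential(sum of rates)

t = 0.2
frac_exceeding = sum(u > t for u in samples) / len(samples)
theoretical_tail = math.exp(-sum(alphas) * t)   # P{U > t} from Property 3

print(f"mean: simulated {sample_mean:.4f}, theory {theoretical_mean:.4f}")
print(f"P(U > {t}): simulated {frac_exceeding:.4f}, theory {theoretical_tail:.4f}")
```

With n identical servers of rate mu, passing `[mu] * n` gives the multiple-server case discussed above, where the time to the next service completion has parameter n times mu.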


Property 4: Relationship to the Poisson distribution. 


Suppose that the time between consecutive occurrences of some particular kind
of event (e.g., arrivals or service completions by a continuously busy server) has an
exponential distribution with parameter $\alpha$. Property 4 then has to do with the resulting
implication about the probability distribution of the number of times this kind of event
occurs over a specified length of time. In particular, let X(t) be the number of occurrences
by time t (t ≥ 0), where time 0 designates the instant at which the count begins.
The implication is that

$$
P\{X(t) = n\} = \frac{(\alpha t)^n e^{-\alpha t}}{n!}, \qquad \text{for } n = 0, 1, 2, \ldots;
$$

that is, X(t) has a Poisson distribution with parameter $\alpha t$. For example, with n = 0,

$$
P\{X(t) = 0\} = e^{-\alpha t},
$$

which is just the probability from the exponential distribution that the first event occurs
after time t. The mean of this Poisson distribution is

$$
E\{X(t)\} = \alpha t,
$$

so that the expected number of events per unit time is $\alpha$. Thus $\alpha$ is said to be the
mean rate at which the events occur. When the events are counted on a continuing
basis, the counting process {X(t); t ≥ 0} is said to be a Poisson process with parameter
$\alpha$ (the mean rate).

This property provides useful information about service completions when
service times have an exponential distribution with parameter $\mu$. We obtain this information
by defining X(t) as the number of service completions achieved by a continuously
busy server in elapsed time t, where $\alpha = \mu$. For multiple-server queueing
models, X(t) can also be defined as the number of service completions achieved by n
continuously busy servers in elapsed time t, where $\alpha = n\mu$.


The property is particularly useful for describing the probabilistic behavior of
arrivals when interarrival times have an exponential distribution with parameter $\lambda$. In
this case, X(t) would be the number of arrivals in elapsed time t, where $\alpha = \lambda$ is the
mean arrival rate. Therefore, arrivals occur according to a Poisson input process
with parameter $\lambda$. Such queueing models also are described as assuming a Poisson
input. 
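
As a small numerical illustration of Property 4, the Poisson probabilities can be evaluated directly and checked against the two facts stated above: the n = 0 term equals the exponential probability that the first event occurs after time t, and the mean equals the rate times t. The function name, the rate, the time horizon, and the truncation at 60 terms are assumptions for this sketch.

```python
import math

def poisson_pmf(n: int, alpha: float, t: float) -> float:
    """P{X(t) = n} = (alpha*t)^n * exp(-alpha*t) / n!"""
    m = alpha * t
    return m ** n * math.exp(-m) / math.factorial(n)

alpha, t = 3.0, 2.0            # illustrative mean rate and elapsed time
probs = [poisson_pmf(n, alpha, t) for n in range(60)]  # the tail beyond 60 is negligible here

total = sum(probs)
mean = sum(n * p for n, p in enumerate(probs))

print(f"sum of probabilities = {total:.6f}")
print(f"mean = {mean:.6f}, alpha * t = {alpha * t:.6f}")
print(f"P(X(t) = 0) = {probs[0]:.6e}, exp(-alpha t) = {math.exp(-alpha * t):.6e}")
```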

Arrivals sometimes are said to occur randomly, meaning that they occur in 
accord with a Poisson input process. One intuitive interpretation of this phenomenon 
is that every time period of fixed length has the same chance of having an arrival 
regardless of when the preceding arrival occurred, as suggested by the following 
property. 


Property 5: For all positive values of t, $P\{T \le t + \Delta t \mid T > t\} \approx \alpha\,\Delta t$, for
small $\Delta t$.


Continuing to interpret T as the time from the last event of a certain type (arrival
or service completion) until the next such event, suppose that a time t already has
elapsed without the event occurring. We know from Property 2 that the probability
that the event will occur within the next time interval of fixed length $\Delta t$ is a constant
(identified in the next paragraph), regardless of how large or small t is. Property 5
goes further to say that, when the value of $\Delta t$ is small, this constant probability can
be approximated very closely by $\alpha\,\Delta t$. Furthermore, when considering different small
values of $\Delta t$, this probability is essentially proportional to $\Delta t$, with proportionality
factor $\alpha$. In fact, $\alpha$ is the mean rate at which the events occur (see Property 4), so
that the expected number of events in the interval of length $\Delta t$ is exactly $\alpha\,\Delta t$. The
only reason that the probability of an event occurring differs slightly from this value
is the possibility that more than one event will occur, which has negligible probability
when $\Delta t$ is small.

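To see why Property 5 holds mathematically, note that the constant value of
our probability (for a fixed value of $\Delta t > 0$) is just

$$
P\{T \le t + \Delta t \mid T > t\} = P\{T \le \Delta t\} = 1 - e^{-\alpha \Delta t}
$$

for any t ≥ 0. Therefore, because the series expansion of $e^x$ for any exponent x is

$$
e^x = 1 + x + \sum_{n=2}^{\infty} \frac{x^n}{n!},
$$

it follows that

$$
\begin{aligned}
P\{T \le t + \Delta t \mid T > t\}
&= 1 - 1 + \alpha\,\Delta t - \sum_{n=2}^{\infty} \frac{(-\alpha\,\Delta t)^n}{n!} \\
&\approx \alpha\,\Delta t, \qquad \text{for small } \Delta t,^{1}
\end{aligned}
$$

because the summation terms become relatively negligible for sufficiently small values
of $\alpha\,\Delta t$.


¹ More precisely,

$$
\lim_{\Delta t \to 0} \frac{P\{T \le t + \Delta t \mid T > t\}}{\Delta t} = \alpha.
$$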




Because T can represent either interarrival or service times in queueing models, 
this property provides a convenient approximation of the probability that the event of 
interest occurs in the next small interval ($\Delta t$) of time. An analysis based on this
approximation also can be made exact by taking appropriate limits as $\Delta t \to 0$.
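
To get a feel for how good the approximation in Property 5 is, the exact probability can be compared with the rate times the interval length for shrinking intervals. This is a minimal sketch; the function name and the rate are illustrative assumptions.

```python
import math

def prob_event_within(dt: float, alpha: float) -> float:
    """Exact P{T <= t + dt | T > t} = 1 - exp(-alpha*dt), which is the same for every t."""
    return 1.0 - math.exp(-alpha * dt)

alpha = 2.0  # illustrative mean rate
for dt in (0.5, 0.1, 0.01, 0.001):
    exact = prob_event_within(dt, alpha)
    approx = alpha * dt
    # The ratio exact/approx tends to 1 as dt shrinks, as the limit statement requires.
    print(f"dt={dt:<6} exact={exact:.6f} approx={approx:.6f} ratio={exact / approx:.4f}")
```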


Property 6: Unaffected by aggregation or disaggregation. 


This property is relevant primarily for verifying that the input process is Poisson. 
Therefore, we shall describe it in these terms, although it also applies directly to the 
exponential distribution (exponential interarrival times) because of Property 4. 

Suppose that there are several (n) different types of customers, where the customers
of each type (type i) arrive according to a Poisson input process with parameter
$\lambda_i$ (i = 1, 2, ..., n). Assuming that these are independent Poisson processes, the
property says that the aggregate input process (arrival of all customers without regard
to type) also must be Poisson, with parameter (arrival rate) $\lambda = \lambda_1 + \lambda_2 + \cdots + \lambda_n$.
In other words, having a Poisson process is unaffected by aggregation.

This part of the property follows directly from Properties 3 and 4. The latter
property implies that the interarrival times for customers of type i have an exponential
distribution with parameter $\lambda_i$. For this identical situation, we already discussed for
Property 3 that it implies that the interarrival times for all customers also must have
an exponential distribution, with parameter $\lambda = \lambda_1 + \lambda_2 + \cdots + \lambda_n$. Using Property
4 again then implies that the aggregate input process is Poisson.

The second part of Property 6 (‘‘unaffected by disaggregation’’) refers to the
reverse case, where the aggregate input process is known to be Poisson with parameter
$\lambda$, but the question now is the nature of the disaggregated input processes for the
individual customer types. Assuming that each arriving customer has a fixed probability
$p_i$ of being of type i (i = 1, 2, ..., n), with

$$
\lambda_i = p_i \lambda \qquad \text{and} \qquad \sum_{i=1}^{n} p_i = 1,
$$

the property says that the input process for customers of type i also must be Poisson,
with parameter $\lambda_i$. In other words, having a Poisson process is unaffected by
disaggregation.

As one example of the usefulness of this second part of the property, consider 
the following situation. Indistinguishable customers arrive according to a Poisson 
process with parameter $\lambda$. Each arriving customer has a fixed probability p of balking
(leaving without entering the queueing system), so the probability of entering the
system is $(1 - p)$. Thus there are two types of customers: those that balk and those
that enter the system. The property says that each type arrives according to a Poisson
process, with parameters $p\lambda$ and $(1 - p)\lambda$, respectively. Therefore, by using the latter
Poisson process, queueing models that assume a Poisson input process can still be 
used to analyze the performance of the queueing system for those customers that enter 
the system. 
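
The balking example can be explored with a short simulation. The sketch below generates arrivals from a Poisson process, lets each arrival balk independently with a fixed probability, and then measures the average time between customers who actually enter. Under the disaggregation property this average should be close to the reciprocal of the entering rate. The rate, balking probability, sample size, seed, and function name are all illustrative assumptions.

```python
import random

def simulate_entering_gaps(lam: float, p_balk: float,
                           n_arrivals: int = 200_000, seed: int = 1):
    """Return the times between successive customers who enter (do not balk)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    gaps, since_last_entry = [], 0.0
    for _ in range(n_arrivals):
        since_last_entry += rng.expovariate(lam)  # next arrival of the aggregate process
        if rng.random() >= p_balk:                # customer enters with probability 1 - p
            gaps.append(since_last_entry)
            since_last_entry = 0.0
    return gaps

lam, p_balk = 4.0, 0.25        # illustrative values
gaps = simulate_entering_gaps(lam, p_balk)
mean_gap = sum(gaps) / len(gaps)
expected_gap = 1.0 / ((1 - p_balk) * lam)  # mean interarrival time at rate (1 - p) * lam

print(f"simulated mean gap {mean_gap:.4f}, expected {expected_gap:.4f}")
```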


16.5 The Birth-and-Death Process 


Figure 16.4 Rate diagram for the birth-and-death process.

Most elementary queueing models assume that the inputs (arriving customers) and
outputs (leaving customers) of the queueing system occur according to the birth-and-death
process. This important process in probability theory has applications in various
areas. However, in the context of queueing theory, the term birth refers to the arrival
of a new customer into the queueing system, and death refers to the departure of a
served customer. The state of the system at time t (t ≥ 0), N(t), is the number of
customers in the queueing system at time t. The birth-and-death process describes 
probabilistically how N(t) changes as t increases. Broadly speaking, it says that in- 
dividual births and deaths occur randomly, where their mean occurrence rates depend 
only upon the current state of the system. More precisely, the assumptions of the 
birth-and-death process are the following: 


ASSUMPTION 1: Given N(t) = n, the current probability distribution of the
remaining time until the next birth (arrival) is exponential with parameter $\lambda_n$
(n = 0, 1, 2, ...).


ASSUMPTION 2: Given N(t) = n, the current probability distribution of the remaining
time until the next death (service completion) is exponential with parameter
$\mu_n$ (n = 1, 2, ...).


ASSUMPTION 3: Only one birth or death can occur at a time.


Because of Assumptions 1 and 2, the birth-and-death process is a special type 
of continuous time Markov chain. (See Sec. 15.9 for a description of continuous time 
Markov chains and their properties.) Queueing models that can be represented by a 
continuous time Markov chain are far more tractable analytically than any other, and 
Assumption 3 simplifies the analysis considerably further. 

Because Property 4 for the exponential distribution (see Sec. 16.4) implies that 
the $\lambda_n$ and $\mu_n$ are mean rates, we can summarize these assumptions by the rate
diagram shown in Fig. 16.4. The arrows in this diagram show the only possible 
transitions in the state of the system (as specified by Assumption 3), and the entry 
for each arrow gives the mean rate for that transition (as specified by Assumptions 
1 and 2) when the system is in the state at the base of the arrow. 

Except for a few special cases, analysis of the birth-and-death process is very 
difficult when the system is in a transient condition. Some results about the probability 
distribution of N(t)¹ have been obtained, but they are too complicated to be of much
practical use. On the other hand, it is relatively straightforward to derive this distribution
after the system has reached a steady-state condition (assuming that this condition
can be reached). This derivation can be done directly from the rate diagram,
as outlined next.

¹ Karlin, S., and J. McGregor: ‘‘Many Server Queueing Processes with Poisson Input and Exponential
Service Times,’’ Pacific Journal of Mathematics, 8:87-118, 1958.

Consider any particular state of the system n (n = 0, 1, 2, ...). Starting at
time 0, suppose that a count is made of the number of times that the process enters
this state and the number of times it leaves this state, as denoted below:


$E_n(t)$ = number of times that the process enters state n by time t.
$L_n(t)$ = number of times that the process leaves state n by time t.


Because the two types of events (entering and leaving) must alternate, these two
numbers must always be either equal or differ by just 1; that is,

$$
|E_n(t) - L_n(t)| \le 1.
$$

Dividing through both sides by t and then letting $t \to \infty$ gives

$$
\left| \frac{E_n(t)}{t} - \frac{L_n(t)}{t} \right| \le \frac{1}{t},
\qquad \text{so} \qquad
\lim_{t \to \infty} \left| \frac{E_n(t)}{t} - \frac{L_n(t)}{t} \right| = 0.
$$

Dividing $E_n(t)$ and $L_n(t)$ by t gives the actual rate (number of events per unit time) at
which these two kinds of events have occurred, and letting $t \to \infty$ then gives the mean
rate (expected number of events per unit time).

$$
\lim_{t \to \infty} \frac{E_n(t)}{t} = \text{mean rate at which the process enters state } n.
$$

$$
\lim_{t \to \infty} \frac{L_n(t)}{t} = \text{mean rate at which the process leaves state } n.
$$
t-> 


These results yield the following key principle: 


RATE IN = RATE OUT PRINCIPLE: For any state of the system n (n =
0, 1, 2, ...), mean entering rate = mean leaving rate.


The equation expressing this principle is called the balance equation for state 
n. After constructing the balance equations for all the states in terms of the unknown
$P_n$ probabilities, we can solve this system of equations to find these probabilities.

To illustrate a balance equation, consider state 0. The process enters this state 
only from state 1. Thus the steady-state probability of being in state 1 ($P_1$) represents
the proportion of time that it would be possible for the process to enter state 0. Given
that the process is in state 1, the mean rate of entering state 0 is $\mu_1$. (In other words,
for each cumulative unit of time that the process spends in state 1, the expected
number of times that it would leave state 1 to enter state 0 is $\mu_1$.) From any other
state, this mean rate is 0. Therefore, the overall mean rate at which the process leaves
its current state to enter state 0 (the mean entering rate) is

$$
\mu_1 P_1 + 0(1 - P_1) = \mu_1 P_1.
$$

By the same reasoning, the mean leaving rate must be $\lambda_0 P_0$, so the balance equation
for state 0 is

$$
\mu_1 P_1 = \lambda_0 P_0.
$$


For every other state there are two possible transitions both into and out of the 
state. Therefore, each side of the balance equations for these states represents the sum 
of the mean rates for the two transitions involved. Otherwise, the reasoning is just 
the same as for state 0. These balance equations are summarized in Table 16.1. 

Table 16.1 Balance Equations for Birth-and-Death Process

State      Rate In = Rate Out
0          $\mu_1 P_1 = \lambda_0 P_0$
1          $\lambda_0 P_0 + \mu_2 P_2 = (\lambda_1 + \mu_1) P_1$
2          $\lambda_1 P_1 + \mu_3 P_3 = (\lambda_2 + \mu_2) P_2$
...
n - 1      $\lambda_{n-2} P_{n-2} + \mu_n P_n = (\lambda_{n-1} + \mu_{n-1}) P_{n-1}$
n          $\lambda_{n-1} P_{n-1} + \mu_{n+1} P_{n+1} = (\lambda_n + \mu_n) P_n$
...

Notice that the first balance equation contains two variables for which to solve
($P_0$ and $P_1$), the first two equations contain three variables ($P_0$, $P_1$, and $P_2$), and so
on, so that there always is one ‘‘extra’’ variable. Therefore, the procedure in solving
these equations is to solve in terms of one of the variables, the most convenient one
being $P_0$. Thus the first equation is used to solve for $P_1$ in terms of $P_0$; this result and
the second equation are then used to solve for $P_2$ in terms of $P_0$, and so forth. At the
end, the requirement that the sum of all the probabilities must equal 1 can be used to
evaluate $P_0$.
Applying this procedure yields the following results: 


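$$
\begin{aligned}
\text{State } 0: \quad & P_1 = \frac{\lambda_0}{\mu_1} P_0 \\
\text{State } 1: \quad & P_2 = \frac{\lambda_1}{\mu_2} P_1 + \frac{1}{\mu_2} (\mu_1 P_1 - \lambda_0 P_0)
  = \frac{\lambda_1}{\mu_2} P_1 = \frac{\lambda_1 \lambda_0}{\mu_2 \mu_1} P_0 \\
\text{State } 2: \quad & P_3 = \frac{\lambda_2}{\mu_3} P_2 + \frac{1}{\mu_3} (\mu_2 P_2 - \lambda_1 P_1)
  = \frac{\lambda_2}{\mu_3} P_2 = \frac{\lambda_2 \lambda_1 \lambda_0}{\mu_3 \mu_2 \mu_1} P_0 \\
& \;\vdots \\
\text{State } n - 1: \quad & P_n = \frac{\lambda_{n-1}}{\mu_n} P_{n-1} + \frac{1}{\mu_n} (\mu_{n-1} P_{n-1} - \lambda_{n-2} P_{n-2})
  = \frac{\lambda_{n-1}}{\mu_n} P_{n-1} = \frac{\lambda_{n-1} \lambda_{n-2} \cdots \lambda_0}{\mu_n \mu_{n-1} \cdots \mu_1} P_0 \\
\text{State } n: \quad & P_{n+1} = \frac{\lambda_n}{\mu_{n+1}} P_n + \frac{1}{\mu_{n+1}} (\mu_n P_n - \lambda_{n-1} P_{n-1})
  = \frac{\lambda_n}{\mu_{n+1}} P_n = \frac{\lambda_n \lambda_{n-1} \cdots \lambda_0}{\mu_{n+1} \mu_n \cdots \mu_1} P_0 \\
& \;\vdots
\end{aligned}
$$

To simplify notation, let

$$
C_n = \frac{\lambda_{n-1} \lambda_{n-2} \cdots \lambda_0}{\mu_n \mu_{n-1} \cdots \mu_1}, \qquad \text{for } n = 1, 2, \ldots.
$$

Thus the steady-state probabilities are

$$
P_n = C_n P_0, \qquad \text{for } n = 1, 2, \ldots.
$$

The requirement that

$$
\sum_{n=0}^{\infty} P_n = 1
$$

implies that

$$
\left[ 1 + \sum_{n=1}^{\infty} C_n \right] P_0 = 1,
$$

so that

$$
P_0 = \frac{1}{1 + \sum_{n=1}^{\infty} C_n}.
$$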





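Given this information,

$$
L = \sum_{n=0}^{\infty} n P_n.
$$

Also, because the number of servers s represents the number of customers that can
be served (and thus are not in the queue) simultaneously,

$$
L_q = \sum_{n=s}^{\infty} (n - s) P_n.
$$

Furthermore, the expected waiting times follow as

$$
W = \frac{L}{\bar{\lambda}}, \qquad W_q = \frac{L_q}{\bar{\lambda}},
$$

where $\bar{\lambda}$ is the average arrival rate over the long run. Because $\lambda_n$ is the mean arrival
rate while the system is in state n (n = 0, 1, 2, ...), and $P_n$ is the proportion of
time that the system is in this state,

$$
\bar{\lambda} = \sum_{n=0}^{\infty} \lambda_n P_n.
$$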





Several of the expressions just given involve summations with an infinite number 
of terms. Fortunately, these summations have analytic solutions for a number of 
interesting special cases,¹ as seen in the next section. Otherwise, they can be approximated
by summing a finite number of terms on a computer.

These steady-state results have been derived under the assumption that the $\lambda_n$
and $\mu_n$ parameters have values such that the process actually can reach a steady-state
condition. This assumption always holds if $\lambda_n = 0$ for some value of n greater than
the initial state, so that only a finite number of states (those less than this n) are
possible. It also always holds when $\lambda$ and $\mu$ are defined (see ‘‘Terminology and
Notation’’ in Sec. 16.2) and $\rho = \lambda/(s\mu) < 1$. It does not hold if $\sum_{n=1}^{\infty} C_n = \infty$.


¹ These solutions are based on the following known results for the sum of any geometric series:

$$
\sum_{n=0}^{N} x^n = \frac{1 - x^{N+1}}{1 - x}, \qquad \text{for any } x \ne 1,
$$

$$
\sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}, \qquad \text{if } |x| < 1.
$$

The following section describes several queueing models that are just special 
cases of the birth-and-death process. Therefore, the general steady-state results just 
given in boxes will be used over and over again to obtain the specific steady-state 
results for these models. 
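
When no closed form is available, the general steady-state results can be approximated by summing a finite number of terms, as noted above. The following is a minimal sketch of that idea: it builds the C_n factors recursively, normalizes to get the state probabilities, and then evaluates L, L_q, W, and W_q. The function names, the truncation point `n_max`, the convention C_0 = 1, and the example rates are assumptions made for illustration, and the truncation is only reasonable when the neglected tail is negligible.

```python
from typing import Callable, List

def birth_death_steady_state(birth_rate: Callable[[int], float],
                             death_rate: Callable[[int], float],
                             n_max: int) -> List[float]:
    """Approximate [P_0, ..., P_n_max] using C_n = C_{n-1} * lambda_{n-1} / mu_n."""
    c = [1.0]                                   # C_0 = 1 (convention so that P_0 = C_0 * P_0)
    for n in range(1, n_max + 1):
        c.append(c[-1] * birth_rate(n - 1) / death_rate(n))
    p0 = 1.0 / sum(c)                           # normalization: probabilities sum to 1
    return [cn * p0 for cn in c]

# Illustrative single-server example: constant arrival rate 2 and service rate 3.
lam, mu, s = 2.0, 3.0, 1

def arrival_rate(n: int) -> float:
    return lam

def service_rate(n: int) -> float:
    return mu * min(n, s)                       # at most s servers can be busy

probs = birth_death_steady_state(arrival_rate, service_rate, n_max=200)

L = sum(n * p for n, p in enumerate(probs))
Lq = sum((n - s) * p for n, p in enumerate(probs) if n >= s)
lam_bar = sum(arrival_rate(n) * p for n, p in enumerate(probs))  # long-run average arrival rate
W, Wq = L / lam_bar, Lq / lam_bar

print(f"P0={probs[0]:.4f} L={L:.4f} Lq={Lq:.4f} W={W:.4f} Wq={Wq:.4f}")
```

Passing different `arrival_rate` and `service_rate` functions covers other special cases of the birth-and-death process without changing the solver.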


16.6 Queueing Models Based on the Birth-and-Death Process 


Because each of the mean rates $\lambda_0, \lambda_1, \ldots$ and $\mu_1, \mu_2, \ldots$ for the birth-and-death
process can be assigned any nonnegative value, we have great flexibility in modeling
a queueing system. Probably the most widely used models in queueing theory are
based directly upon this process. Because of Assumptions 1 and 2 (and Property 4
for the exponential distribution), these models are said to have a Poisson input and
exponential service times. The models differ only in their assumptions about how
the $\lambda_n$ and the $\mu_n$ change with n. We present four of these models in this section for
four important types of queueing systems.


The M/M/s Model 


As described in Sec. 16.2, the M/M/s model assumes that all interarrival times are 
independently and identically distributed according to an exponential distribution (i.e., 
the input process is Poisson), that all service times are independent and identically 
distributed according to another exponential distribution, and that the number of serv- 
ers is s (any positive integer). Consequently, this model is just the special case of the 
birth-and-death process where the queueing system’s mean arrival rate and mean 
service rate per busy server are constant ($\lambda$ and $\mu$, respectively) regardless of the
state of the system. When the system has just a single server (s = 1), the implication
is that the parameters for the birth-and-death process are $\lambda_n = \lambda$ (n = 0, 1, 2, ...)
and $\mu_n = \mu$ (n = 1, 2, ...). The resulting rate diagram is shown in Fig. 16.5a.


Figure 16.5 Rate diagrams for the M/M/s model. (a) Single-server case (s = 1):
$\lambda_n = \lambda$ for n = 0, 1, 2, ...; $\mu_n = \mu$ for n = 1, 2, .... (b) Multiple-server case (s > 1):
$\lambda_n = \lambda$ for n = 0, 1, 2, ...; $\mu_n = n\mu$ for n = 1, 2, ..., s; $\mu_n = s\mu$ for n = s, s + 1, ....

However, when the system has multiple servers (s > 1), the $\mu_n$ cannot be
expressed this simply. Keep in mind that $\mu_n$ represents the mean service rate for the
overall queueing system (i.e., the mean rate at which service completions occur, so
that customers leave the system) when there are n customers currently in the system.
As mentioned for Property 4 of the exponential distribution (see Sec. 16.4), when the
mean service rate per busy server is $\mu$, the overall mean service rate for n busy servers
must be $n\mu$. Therefore, $\mu_n = n\mu$ when n ≤ s, whereas $\mu_n = s\mu$ when n ≥ s so
that all s servers are busy. The rate diagram for this case is shown in Fig. 16.5b.

When the maximum mean service rate ($s\mu$) exceeds the mean arrival rate ($\lambda$),
that is, when

$$
\rho = \frac{\lambda}{s\mu} < 1,
$$

a queueing system fitting this model will eventually reach a steady-state condition. In
this situation, the steady-state results derived in Sec. 16.5 for the general birth-and-death
process are directly applicable. However, these results simplify considerably
for this model and yield closed-form expressions for the $P_n$, L, $L_q$, and so forth, as
shown next.


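RESULTS FOR THE SINGLE-SERVER CASE (M/M/1): For s = 1, the $C_n$ factors
for the birth-and-death process reduce to

$$
C_n = \left( \frac{\lambda}{\mu} \right)^n = \rho^n, \qquad \text{for } n = 1, 2, \ldots.
$$

Therefore,

$$
P_n = \rho^n P_0, \qquad \text{for } n = 1, 2, \ldots,
$$

where

$$
P_0 = \frac{1}{1 + \sum_{n=1}^{\infty} \rho^n}
    = \left( \sum_{n=0}^{\infty} \rho^n \right)^{-1}
    = \left( \frac{1}{1 - \rho} \right)^{-1}
    = 1 - \rho.
$$

Thus

$$
P_n = (1 - \rho)\rho^n, \qquad \text{for } n = 0, 1, 2, \ldots.
$$

Consequently,

$$
\begin{aligned}
L &= \sum_{n=0}^{\infty} n (1 - \rho) \rho^n \\
  &= (1 - \rho)\rho \sum_{n=0}^{\infty} \frac{d}{d\rho} (\rho^n) \\
  &= (1 - \rho)\rho \frac{d}{d\rho} \left( \sum_{n=0}^{\infty} \rho^n \right) \\
  &= (1 - \rho)\rho \frac{d}{d\rho} \left( \frac{1}{1 - \rho} \right) \\
  &= \frac{\rho}{1 - \rho} = \frac{\lambda}{\mu - \lambda}.
\end{aligned}
$$

Similarly,

$$
\begin{aligned}
L_q &= \sum_{n=1}^{\infty} (n - 1) P_n \\
    &= L - 1(1 - P_0) \\
    &= \frac{\lambda^2}{\mu(\mu - \lambda)}.
\end{aligned}
$$

When $\lambda \ge \mu$, so that the mean arrival rate exceeds the mean service rate, the
preceding solution ‘‘blows up’’ (because the summation for computing $P_0$ diverges).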
For this case, the queue would ‘‘explode’’ and grow without bound. 
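
The closed-form single-server results can be cross-checked against a direct summation over the state probabilities. The sketch below does this for one illustrative pair of rates; the function names, the rates, and the 2000-term cutoff are assumptions for the example.

```python
def mm1_state_probability(n: int, lam: float, mu: float) -> float:
    """P_n = (1 - rho) * rho^n for the M/M/1 model (requires lam < mu)."""
    rho = lam / mu
    return (1 - rho) * rho ** n

def mm1_L(lam: float, mu: float) -> float:
    """Expected number in the system: L = lam / (mu - lam)."""
    if lam >= mu:
        raise ValueError("no steady state: the formulas require lam < mu")
    return lam / (mu - lam)

def mm1_Lq(lam: float, mu: float) -> float:
    """Expected queue length: L_q = lam^2 / (mu * (mu - lam))."""
    if lam >= mu:
        raise ValueError("no steady state: the formulas require lam < mu")
    return lam ** 2 / (mu * (mu - lam))

lam, mu = 2.0, 3.0   # illustrative rates with lam < mu

# Direct (truncated) summation of n * P_n and (n - 1) * P_n as a cross-check.
series_L = sum(n * mm1_state_probability(n, lam, mu) for n in range(2000))
series_Lq = sum((n - 1) * mm1_state_probability(n, lam, mu) for n in range(1, 2000))

print(f"L:  closed form {mm1_L(lam, mu):.6f}, series {series_L:.6f}")
print(f"Lq: closed form {mm1_Lq(lam, mu):.6f}, series {series_Lq:.6f}")
```

The guard on `lam >= mu` mirrors the remark above that the solution is not applicable when the arrival rate is not smaller than the service rate.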

Assuming again that $\lambda < \mu$, we now can derive the probability distribution of
the waiting time in the system (including service) $\mathcal{W}$ for a random arrival when the
queue discipline is first-come-first-served. If this arrival finds n customers already in
the system, he will have to wait through (n + 1) exponential service times, including
his own. (For the customer currently being served, recall the lack of memory property
for the exponential distribution discussed in Sec. 16.4.) Therefore, let $T_1, T_2, \ldots$ be
independent service-time random variables having an exponential distribution with
parameter $\mu$, and let

$$
S_{n+1} = T_1 + T_2 + \cdots + T_{n+1}, \qquad \text{for } n = 0, 1, 2, \ldots,
$$

so that $S_{n+1}$ represents the conditional waiting time given n customers already in the
system. As discussed in Sec. 16.7, $S_{n+1}$ is known to have an Erlang distribution.¹
Because the probability that the random arrival will find n customers in the system is
$P_n$, it follows that

$$
P\{\mathcal{W} > t\} = \sum_{n=0}^{\infty} P_n P\{S_{n+1} > t\},
$$

which reduces after considerable algebraic manipulation to

$$
P\{\mathcal{W} > t\} = e^{-\mu(1 - \rho)t}, \qquad \text{for } t \ge 0.
$$

The surprising conclusion is that $\mathcal{W}$ has an exponential distribution with parameter
$\mu(1 - \rho)$. Therefore,

$$
W = E(\mathcal{W}) = \frac{1}{\mu(1 - \rho)} = \frac{1}{\mu - \lambda}.
$$





¹ Outside of queueing theory, this distribution is known as the gamma distribution.

These results include service time in the waiting time. In some contexts (e.g.,
the County Hospital emergency room problem), the more relevant waiting time is just
until service begins. Thus consider the waiting time in the queue (so excluding service
time) $\mathcal{W}_q$ for a random arrival when the queue discipline is first-come-first-served. If
this arrival finds no customers already in the system, he is served immediately, so
that

$$
P\{\mathcal{W}_q = 0\} = P_0 = 1 - \rho.
$$

If he finds n > 0 customers already there instead, then he has to wait through n
exponential service times until his own service begins, so that

$$
\begin{aligned}
P\{\mathcal{W}_q > t\} &= \sum_{n=1}^{\infty} P_n P\{S_n > t\} \\
&= \sum_{n=1}^{\infty} (1 - \rho)\rho^n P\{S_n > t\} \\
&= \rho \sum_{n=0}^{\infty} P_n P\{S_{n+1} > t\} \\
&= \rho P\{\mathcal{W} > t\} \\
&= \rho e^{-\mu(1 - \rho)t}, \qquad \text{for } t \ge 0.
\end{aligned}
$$

By deriving the mean of this distribution [or applying either $L_q = \lambda W_q$ or
$W_q = W - (1/\mu)$],

$$
W_q = E(\mathcal{W}_q) = \frac{\lambda}{\mu(\mu - \lambda)}.
$$
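
The reduction of the conditioning sums to simple exponential expressions can be verified numerically. The sketch below evaluates the sums term by term, using the standard Erlang tail formula for the sum of k independent exponential service times, and compares the totals with the closed forms. The function names, the rates, the time point, and the 400-term truncation are illustrative assumptions.

```python
import math

def erlang_tail(k: int, mu: float, t: float) -> float:
    """P{S_k > t} for the sum of k independent exponential(mu) times:
    the sum over j = 0..k-1 of exp(-mu*t) * (mu*t)^j / j!."""
    x = mu * t
    term, total = math.exp(-x), 0.0
    for j in range(k):
        total += term
        term *= x / (j + 1)     # next Poisson-type term, built recursively
    return total

def tail_by_conditioning(lam: float, mu: float, t: float,
                         include_service: bool = True, n_terms: int = 400) -> float:
    """Truncated sum over n of P_n * P{S_k > t}, with k = n + 1 (system) or k = n (queue)."""
    rho = lam / mu
    total = 0.0
    for n in range(n_terms):
        p_n = (1 - rho) * rho ** n
        k = n + 1 if include_service else n   # number of service times waited through
        if k > 0:
            total += p_n * erlang_tail(k, mu, t)
    return total

lam, mu, t = 2.0, 3.0, 0.5      # illustrative values with lam < mu
rho = lam / mu

print(tail_by_conditioning(lam, mu, t), math.exp(-mu * (1 - rho) * t))
print(tail_by_conditioning(lam, mu, t, include_service=False),
      rho * math.exp(-mu * (1 - rho) * t))
```

Each printed pair should agree to many decimal places, which is consistent with the algebraic result without reproducing the algebra itself.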


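RESULTS FOR THE MULTIPLE-SERVER CASE (s > 1): When s > 1, the $C_n$ factors
become

$$
C_n = \frac{(\lambda/\mu)^n}{n!}, \qquad \text{for } n = 1, 2, \ldots, s,
$$

and

$$
C_n = \frac{(\lambda/\mu)^s}{s!} \left( \frac{\lambda}{s\mu} \right)^{n-s}
    = \frac{(\lambda/\mu)^n}{s!\, s^{n-s}}, \qquad \text{for } n = s, s + 1, \ldots.
$$

Consequently, if $\lambda < s\mu$, then

$$
\begin{aligned}
P_0 &= 1 \Bigg/ \left[ 1 + \sum_{n=1}^{s-1} \frac{(\lambda/\mu)^n}{n!}
      + \frac{(\lambda/\mu)^s}{s!} \sum_{n=s}^{\infty} \left( \frac{\lambda}{s\mu} \right)^{n-s} \right] \\
    &= 1 \Bigg/ \left[ \sum_{n=0}^{s-1} \frac{(\lambda/\mu)^n}{n!}
      + \frac{(\lambda/\mu)^s}{s!} \, \frac{1}{1 - \lambda/(s\mu)} \right]
\end{aligned}
$$

and

$$
P_n =
\begin{cases}
\dfrac{(\lambda/\mu)^n}{n!} P_0, & \text{if } 0 \le n \le s \\[2ex]
\dfrac{(\lambda/\mu)^n}{s!\, s^{n-s}} P_0, & \text{if } n \ge s.
\end{cases}
$$

Using the notation $\rho = \lambda/(s\mu)$,

$$
\begin{aligned}
L_q &= \sum_{n=s}^{\infty} (n - s) P_n \\
    &= \sum_{j=0}^{\infty} j P_{s+j} \\
    &= \sum_{j=0}^{\infty} j \frac{(\lambda/\mu)^s}{s!} \rho^j P_0 \\
    &= P_0 \frac{(\lambda/\mu)^s}{s!} \rho \sum_{j=0}^{\infty} \frac{d}{d\rho} (\rho^j) \\
    &= P_0 \frac{(\lambda/\mu)^s}{s!} \rho \frac{d}{d\rho} \left( \frac{1}{1 - \rho} \right) \\
    &= \frac{P_0 (\lambda/\mu)^s \rho}{s!\,(1 - \rho)^2};
\end{aligned}
$$

$$
W_q = \frac{L_q}{\lambda}; \qquad
W = W_q + \frac{1}{\mu}; \qquad
L = \lambda \left( W_q + \frac{1}{\mu} \right) = L_q + \frac{\lambda}{\mu}.
$$

Figures 16.6 and 16.7 show how $P_0$ and L change with $\rho$ for various values
of s.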

The single-server method for finding the probability distribution of waiting times 
also can be extended to the multiple-server case. This yields¹ (for t ≥ 0)

$$
P\{\mathcal{W} > t\} = e^{-\mu t} \left[ 1 + \frac{P_0 (\lambda/\mu)^s}{s!\,(1 - \rho)}
  \left( \frac{1 - e^{-\mu t (s - 1 - \lambda/\mu)}}{s - 1 - \lambda/\mu} \right) \right]
$$

and

$$
P\{\mathcal{W}_q > t\} = [1 - P\{\mathcal{W}_q = 0\}]\, e^{-s\mu(1 - \rho)t},
$$

where

$$
P\{\mathcal{W}_q = 0\} = \sum_{n=0}^{s-1} P_n.
$$

If $\lambda \ge s\mu$, so that the mean arrival rate exceeds the maximum mean service
rate, the queue grows without bound, so the preceding steady-state solutions are not
applicable.

¹ When $s - 1 - \lambda/\mu = 0$, $(1 - e^{-\mu t(s - 1 - \lambda/\mu)})/(s - 1 - \lambda/\mu)$ should be replaced by $\mu t$.

Figure 16.6 Values of $P_0$ for the M/M/s model (Sec. 16.6).


EXAMPLE: For the County Hospital emergency room problem (see Sec. 16.1), the
Management Engineer has concluded that the emergency cases arrive pretty much at
random (a Poisson input process), so that interarrival times have an exponential distribution.
He also has concluded that the time spent by a doctor treating the cases
approximately follows an exponential distribution. Therefore, he has chosen the
M/M/s model for a preliminary study of this queueing system.


By projecting the available data for the early evening shift into next year, he 
estimates that patients will arrive at an average rate of one every half hour. A doctor 
requires an average of 20 minutes to treat each patient. Thus, using an hour as the
unit of time,

$$
\frac{1}{\lambda} = \frac{1}{2} \text{ hour per customer}
$$

and

$$
\frac{1}{\mu} = \frac{1}{3} \text{ hour per customer},
$$

so that

$$
\lambda = 2 \text{ customers per hour}, \qquad \mu = 3 \text{ customers per hour}.
$$

Figure 16.7 Values of L for the M/M/s model (Sec. 16.6).


The two alternatives being considered are to continue having just one doctor during 
this shift (s = 1) or to add a second doctor (s = 2). In both cases,

$$
\rho = \frac{\lambda}{s\mu} < 1,
$$

so that the system should approach a steady-state condition. (Actually, because $\lambda$ is
somewhat different during other shifts, the system will never truly reach a steady- 
state condition, but the Management Engineer feels that steady-state results will pro- 
vide a good approximation.) Therefore, the preceding equations are used to obtain 
the results shown in Table 16.2. 

On the basis of these results, he tentatively concluded that a single doctor would 
be inadequate next year for providing the relatively prompt treatment needed in a 
hospital emergency room. You will see later how the Management Engineer checked 
this conclusion by applying two other queueing models that provide better 
representations of the real queueing system in some ways. 

Table 16.2 Steady-State Results from M/M/s Model for County Hospital Problem 

                    s = 1              s = 2 
ρ                   2/3                1/3 
P_0                 1/3                1/2 
P_1                 2/9                1/3 
P_n for n ≥ 2       (1/3)(2/3)^n       (1/3)^n 
L_q                 4/3                1/12 
L                   2                  3/4 
W_q                 2/3                1/24                        (in hours) 
W                   1                  3/8                         (in hours) 
P{W_q > 0}          0.667              0.167 
P{W_q > 1/2}        0.404              0.022 
P{W_q > 1}          0.245              0.003 
P{W_q > t}          (2/3)e^{-t}        (1/6)e^{-4t} 
P{W > t}            e^{-t}             (1/2)e^{-3t}(3 - e^{-t}) 
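The entries of Table 16.2 can be checked numerically. The following is a minimal sketch, assuming the standard M/M/s steady-state formulas given earlier in this section; the function names `mms_measures` and `prob_wq_exceeds` are hypothetical and are not part of the software packaged with the book.

```python
from math import exp, factorial


def mms_measures(lam, mu, s):
    """Steady-state measures of the M/M/s model (requires lam < s*mu)."""
    a = lam / mu          # lambda/mu
    rho = a / s           # utilization factor lambda/(s*mu)
    if rho >= 1:
        raise ValueError("steady state requires lambda < s*mu")
    # Contribution of states n >= s to the normalizing sum.
    tail = a**s / (factorial(s) * (1 - rho))
    p0 = 1.0 / (sum(a**n / factorial(n) for n in range(s)) + tail)
    lq = p0 * a**s * rho / (factorial(s) * (1 - rho) ** 2)
    wq = lq / lam                 # Little's formula
    w = wq + 1 / mu
    return {"rho": rho, "P0": p0, "Lq": lq, "L": lam * w,
            "Wq": wq, "W": w, "P_wait": p0 * tail}   # P_wait = P{Wq > 0}


def prob_wq_exceeds(lam, mu, s, t):
    """P{Wq > t} = P{Wq > 0} * exp(-s*mu*(1 - rho)*t)."""
    m = mms_measures(lam, mu, s)
    return m["P_wait"] * exp(-s * mu * (1 - m["rho"]) * t)


# County Hospital data: lambda = 2, mu = 3, one or two doctors.
for s in (1, 2):
    measures = mms_measures(2, 3, s)
    print(s, {name: round(value, 3) for name, value in measures.items()})
```

Running the loop for s = 1 and s = 2 gives a quick way to compare the two alternatives side by side before turning to the more refined models.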


The Finite Queue Variation of the M/M/s Model 


We mentioned in the discussion of queues in Sec. 16.2 that queueing systems some- 
times have a finite queue; i.e., the number of customers in the system is not permitted 
to exceed some specified number (denoted by K). Any customer that arrives while 
the queue is ‘‘full’’ is refused entry into the system and so leaves forever. From the 
viewpoint of the birth-and-death process, the mean input rate into the system becomes 
zero at these times. Therefore, the one modification needed in the M/M/s model to 
introduce a finite queue is to change the λ_n parameters to 

\[ \lambda_n = \begin{cases} \lambda, & \text{for } n = 0, 1, 2, \ldots, K - 1 \\ 0, & \text{for } n \ge K. \end{cases} \]

Because λ_n = 0 for some values of n, a queueing system that fits this model will 
eventually reach a steady-state condition. 

This model commonly is labeled as M/M/s/K, where the presence of the fourth 
symbol distinguishes it from the M/M/s model. The single difference in the 
formulation of these two models is that K is finite for the M/M/s/K model and K = ∞ for 
the M/M/s model. 

The usual physical interpretation for the M/M/s/K model is that there is only 
limited waiting room that will accommodate a maximum of K customers in the system. 
For example, for the County Hospital emergency room problem, this system actually 
would have a finite queue if there were only K cots for the patients and if the policy 
were to send arriving patients to another hospital whenever there were no empty cots. 

Another possible interpretation is that arriving customers will leave and ‘‘take 
their business elsewhere’’ whenever they find too many customers (K) ahead of them 
in the system because they are not willing to incur a long wait. This balking 
phenomenon is quite common in commercial service systems. However, there are other models 
available (e.g., see Prob. 5) that fit this interpretation even better. 

The rate diagram for this model is identical to that shown in Fig. 16.5 for the 
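M/M/s model, except that it stops with state K. 

RESULTS FOR THE SINGLE-SERVER CASE (M/M/1/K): For this case, 

\[ C_n = \begin{cases} (\lambda/\mu)^n = \rho^n, & \text{for } n = 0, 1, 2, \ldots, K \\ 0, & \text{for } n > K. \end{cases} \]

Therefore, for ρ ≠ 1,¹ 

\[ P_0 = \frac{1 - \rho}{1 - \rho^{K+1}}, \qquad P_n = \frac{1 - \rho}{1 - \rho^{K+1}}\,\rho^n, \quad \text{for } n = 0, 1, 2, \ldots, K. \]

Hence 

\[ L = \sum_{n=0}^{K} n P_n = \frac{\rho}{1 - \rho} - \frac{(K + 1)\rho^{K+1}}{1 - \rho^{K+1}}. \]

As usual (when s = 1), 

\[ L_q = L - (1 - P_0). \]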


Notice that the preceding results do not require that λ < μ (i.e., that ρ < 1). 

When ρ < 1, it can be verified that the second term in the final expression for 
L converges to zero as K → ∞, so that all of the preceding results do indeed converge 
to the corresponding results given earlier for the M/M/1 model. 


1 If ρ = 1, then P_n = 1/(K + 1) for n = 0, 1, 2, ..., K, so that L = K/2. 

The waiting-time distributions can be derived by using the same reasoning as 
for the M/M/1 model (see Prob. 16). However, no simple expressions are obtained 
in this case, so computer calculations are required. Fortunately, even though L ≠ λW 
and L_q ≠ λW_q for the current model because the λ_n are not equal for all n (see the 
end of Sec. 16.2), the expected waiting times for customers entering the system still 
can be obtained directly from the expressions given at the end of Sec. 16.5, 

\[ W = \frac{L}{\bar{\lambda}}, \qquad W_q = \frac{L_q}{\bar{\lambda}}, \]

where 

\[ \bar{\lambda} = \sum_{n=0}^{\infty} \lambda_n P_n = \sum_{n=0}^{K-1} \lambda P_n = \lambda(1 - P_K). \]


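RESULTS FOR THE MULTIPLE-SERVER CASE (s > 1): Because this model does 
not allow more than K customers in the system, K is the maximum number of servers 
that could ever be used. Therefore, assume that s ≤ K. In this case, C_n becomes 

\[ C_n = \begin{cases}
\dfrac{(\lambda/\mu)^n}{n!}, & \text{for } n = 1, 2, \ldots, s \\[2ex]
\dfrac{(\lambda/\mu)^s}{s!}\left(\dfrac{\lambda}{s\mu}\right)^{n-s} = \dfrac{(\lambda/\mu)^n}{s!\,s^{n-s}}, & \text{for } n = s, s + 1, \ldots, K \\[2ex]
0, & \text{for } n > K.
\end{cases} \]

Hence 

\[ P_n = \begin{cases}
\dfrac{(\lambda/\mu)^n}{n!}\,P_0, & \text{for } n = 1, 2, \ldots, s \\[2ex]
\dfrac{(\lambda/\mu)^n}{s!\,s^{n-s}}\,P_0, & \text{for } n = s, s + 1, \ldots, K \\[2ex]
0, & \text{for } n > K,
\end{cases} \]

where 

\[ P_0 = 1 \Bigg/ \left[ 1 + \sum_{n=1}^{s} \frac{(\lambda/\mu)^n}{n!} + \frac{(\lambda/\mu)^s}{s!} \sum_{n=s+1}^{K} \left(\frac{\lambda}{s\mu}\right)^{n-s} \right]. \]

Adapting the derivation of L_q for the M/M/s model to this case (see Prob. 22) yields 

\[ L_q = \frac{P_0 (\lambda/\mu)^s \rho}{s!\,(1 - \rho)^2} \left[ 1 - \rho^{K-s} - (K - s)\rho^{K-s}(1 - \rho) \right], \]

where ρ = λ/(sμ).¹ It can then be shown (see Prob. 33) that 

\[ L = \sum_{n=0}^{s-1} n P_n + L_q + s\left(1 - \sum_{n=0}^{s-1} P_n\right). \]

W and W_q are obtained from these quantities just as shown for the single-server case. 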


1 If ρ = 1, it is necessary to apply L'Hôpital's rule twice to this expression for L_q. Otherwise, all of these 
multiple-server results hold for all ρ > 0. The reason that this queueing system can reach a steady-state 
condition even when ρ ≥ 1 is that λ_n = 0 for n ≥ K, so that the number of customers in the system cannot 
continue growing indefinitely. 
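Because the finite queue model has only K + 1 states, its steady-state results can also be computed directly from the C_n factors without the closed-form expressions. The sketch below is one way to do this, assuming the birth-and-death relations P_n = C_n P_0 of Sec. 16.5; the function name `mmsk_measures` is hypothetical. It covers both the single-server and the multiple-server case.

```python
from math import factorial


def mmsk_measures(lam, mu, s, K):
    """Steady-state measures of the M/M/s/K model (1 <= s <= K), any rho > 0."""
    if not 1 <= s <= K:
        raise ValueError("need 1 <= s <= K")
    a = lam / mu
    # C_n factors for n = 0, 1, ..., K (C_n = 0 for n > K).
    c = [a**n / factorial(n) if n <= s
         else a**n / (factorial(s) * s ** (n - s))
         for n in range(K + 1)]
    total = sum(c)
    p = [cn / total for cn in c]                 # P_n = C_n * P_0
    L = sum(n * pn for n, pn in enumerate(p))
    Lq = sum((n - s) * pn for n, pn in enumerate(p) if n > s)
    lam_bar = lam * (1 - p[K])                   # rate of customers who enter
    return {"P": p, "L": L, "Lq": Lq, "W": L / lam_bar, "Wq": Lq / lam_bar}


# Illustration with the County Hospital rates and a hypothetical limit of K = 3.
result = mmsk_measures(2, 3, 1, 3)
print(round(result["L"], 4), round(result["W"], 4))
```

Summing the finite list of probabilities avoids the special handling that the closed-form expressions need when ρ = 1.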


The Finite Calling Population Variation of the M/M/s Model 


Now assume that the only deviation from the M/M/s model is that (as defined in 
Sec. 16.2) the input source is limited; i.e., the size of the calling population is finite. 
For this case, let N denote the size of the calling population. Thus, when the number 
of customers in the queueing system is n (n = 0,1, 2,...,N), there are only 
(N — n) potential customers remaining in the input source. 

The most important application of this model has been to the machine repair 
problem, where one or more repairmen are assigned the responsibility of maintaining 
in operational order a certain group of N machines by repairing each one that breaks 
down. The repairmen are considered to be individual servers in the queueing system 
if they work individually on different machines, whereas the entire crew is considered 
to be a single server if they work together on each machine. The machines constitute 
the calling population. Each one is considered to be a customer in the queueing system 
when it is down waiting to be repaired, whereas it is outside the queueing system 
while it is operational. 

Note that each member of the calling population alternates between being inside 
and outside the queueing system. Therefore, the analog of the M/M/s model that fits 
this situation assumes that each member’s outside time (i.e., the elapsed time from 
leaving the system until returning for the next time) has an exponential distribution 
with parameter λ. When n of the members are inside, and so (N - n) members are 
outside, the current probability distribution of the remaining time until the next arrival 
to the queueing system is the distribution of the minimum of the remaining outside 
times for the latter (N - n) members. Properties 2 and 3 for the exponential 
distribution imply that this distribution must be exponential with parameter λ_n = 
(N - n)λ. Hence this model is just the special case of the birth-and-death process 
that has the rate diagram shown in Fig. 16.8. 

Because λ_n = 0 for n = N, any queueing system that fits this model will 
eventually reach a steady-state condition. The available steady-state results are 
summarized below. 

Figure 16.8 Rate diagrams for the finite calling population variation of the M/M/s model. 
(a) Single-server case (s = 1): λ_n = (N - n)λ for n = 0, 1, 2, ..., N, and λ_n = 0 for n ≥ N; 
μ_n = μ for n = 1, 2, .... 
(b) Multiple-server case (s > 1): λ_n = (N - n)λ for n = 0, 1, 2, ..., N, and λ_n = 0 for n ≥ N; 
μ_n = nμ for n = 1, 2, ..., s, and μ_n = sμ for n = s, s + 1, .... 

RESULTS FOR THE SINGLE-SERVER CASE (s = 1): When s = 1, the C_n factors 
in Sec. 16.5 reduce to 

\[ C_n = \begin{cases}
N(N - 1)\cdots(N - n + 1)\left(\dfrac{\lambda}{\mu}\right)^n = \dfrac{N!}{(N - n)!}\left(\dfrac{\lambda}{\mu}\right)^n, & \text{for } n \le N \\[2ex]
0, & \text{for } n > N,
\end{cases} \]

for this model. Therefore, 

\[ P_0 = 1 \Bigg/ \sum_{n=0}^{N} \left[ \frac{N!}{(N - n)!}\left(\frac{\lambda}{\mu}\right)^n \right]; \]

\[ P_n = \frac{N!}{(N - n)!}\left(\frac{\lambda}{\mu}\right)^n P_0, \quad \text{if } n = 1, 2, \ldots, N; \]

\[ L_q = \sum_{n=1}^{N} (n - 1) P_n, \]

which can be reduced to 

\[ L_q = N - \frac{\lambda + \mu}{\lambda}(1 - P_0); \]

\[ L = \sum_{n=0}^{N} n P_n = L_q + (1 - P_0) = N - \frac{\mu}{\lambda}(1 - P_0). \]

Finally, 

\[ W = \frac{L}{\bar{\lambda}}, \qquad W_q = \frac{L_q}{\bar{\lambda}}, \]

where 

\[ \bar{\lambda} = \sum_{n=0}^{\infty} \lambda_n P_n = \sum_{n=0}^{N} (N - n)\lambda P_n = \lambda(N - L). \]
n=0 n=0 
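RESULTS FOR THE MULTIPLE-SERVER CASE (s > 1): For N ≥ s > 1, 

\[ C_n = \begin{cases}
\dfrac{N!}{(N - n)!\,n!}\left(\dfrac{\lambda}{\mu}\right)^n, & \text{for } n = 1, 2, \ldots, s \\[2ex]
\dfrac{N!}{(N - n)!\,s!\,s^{n-s}}\left(\dfrac{\lambda}{\mu}\right)^n, & \text{for } n = s, s + 1, \ldots, N \\[2ex]
0, & \text{for } n > N.
\end{cases} \]

Hence 

\[ P_n = \begin{cases}
\dfrac{N!}{(N - n)!\,n!}\left(\dfrac{\lambda}{\mu}\right)^n P_0, & \text{if } 0 \le n \le s \\[2ex]
\dfrac{N!}{(N - n)!\,s!\,s^{n-s}}\left(\dfrac{\lambda}{\mu}\right)^n P_0, & \text{if } s \le n \le N \\[2ex]
0, & \text{if } n > N,
\end{cases} \]

where 

\[ P_0 = 1 \Bigg/ \left[ \sum_{n=0}^{s-1} \frac{N!}{(N - n)!\,n!}\left(\frac{\lambda}{\mu}\right)^n + \sum_{n=s}^{N} \frac{N!}{(N - n)!\,s!\,s^{n-s}}\left(\frac{\lambda}{\mu}\right)^n \right]. \]

Finally, 

\[ L_q = \sum_{n=s}^{N} (n - s) P_n \]

and 

\[ L = \sum_{n=0}^{s-1} n P_n + L_q + s\left(1 - \sum_{n=0}^{s-1} P_n\right), \]

which then yield W and W_q by the same equations as in the single-server case. 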

Extensive tables of computational results are available¹ for this model for both 
the single-server and multiple-server cases. 

For both cases, it has been shown² that the preceding formulas for P_0 and P_n 
(and so for L_q, L, W, and W_q) also hold for a generalization of this model. In particular, 
we can drop the assumption that the times spent outside the queueing system by the 
members of the calling population have an exponential distribution, even though this 
takes the model outside the realm of the birth-and-death process. As long as these 
times are identically distributed with mean 1/λ (and the assumption of exponential 
service times still holds), these outside times can have any probability distribution! 
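For a machine repair problem of moderate size, the finite calling population results can be evaluated directly rather than looked up in tables. The following sketch assumes the C_n factors stated above for this model; the function name `finite_source_measures` and the sample data (N = 3 machines, λ = 1, μ = 2) are hypothetical.

```python
from math import factorial


def finite_source_measures(lam, mu, s, N):
    """Steady-state measures for the finite calling population M/M/s model.

    lam is the parameter of each member's exponential outside time
    (mean outside time 1/lam); mu is the service rate of one server.
    """
    if not 1 <= s <= N:
        raise ValueError("need 1 <= s <= N")
    r = lam / mu
    c = []
    for n in range(N + 1):
        perm = factorial(N) // factorial(N - n)        # N!/(N - n)!
        if n <= s:
            c.append(perm / factorial(n) * r**n)
        else:
            c.append(perm / (factorial(s) * s ** (n - s)) * r**n)
    total = sum(c)
    p = [cn / total for cn in c]                       # P_n = C_n * P_0
    L = sum(n * pn for n, pn in enumerate(p))
    Lq = sum((n - s) * pn for n, pn in enumerate(p) if n > s)
    lam_bar = lam * (N - L)                            # mean arrival rate
    return {"P": p, "L": L, "Lq": Lq, "W": L / lam_bar, "Wq": Lq / lam_bar}


# Hypothetical example: three machines, one repairman.
repair = finite_source_measures(1.0, 2.0, 1, 3)
print(round(repair["L"], 4), round(repair["Wq"], 4))
```

With s = 1 the computed L and L_q agree with the reduced expressions N - (μ/λ)(1 - P_0) and N - ((λ + μ)/λ)(1 - P_0), which gives a convenient check on the code.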


A Model with State-Dependent Service Rate and/or Arrival Rate 


All the models thus far have assumed that the mean service rate is always a constant, 
regardless of how many customers are in the system. Unfortunately, this rate often is 
not a constant in real queueing systems, particularly when the servers are people. 
When there is a large backlog of work (i.e., a long queue), it is quite likely that such 
servers will tend to work faster than they do when the backlog is small or nonexistent. 
This increase in the service rate may result merely because the servers increase their 
effort when they are under the pressure of a long queue. However, it may also result 
partly because the quality of the service is compromised or because assistance is 
obtained on certain service phases. 

Given that the mean service rate does increase as the queue size increases, it 
would be desirable to develop a theoretical model that seems to describe the pattern 
by which it increases. This model not only should be a reasonable approximation of 
the actual pattern but also should be simple enough to be practical for implementation. 
One such model is formulated next. (You have the flexibility to formulate many similar 
models within the framework of the birth-and-death process.) We then show how the 
same results apply when the arrival rate is affected by the queue size in an analogous 
way. 


1 Peck, L. G., and R. N. Hazelwood: Finite Queueing Tables, Wiley, New York, 1958. 

2 Bunday, B. D., and R. E. Scraton: "The G/M/r Machine Interference Model," European Journal of 
Operational Research, 4:399-402, 1980. 

FORMULATION FOR THE SINGLE-SERVER CASE (s = 1): Let 

\[ \mu_n = n^c \mu_1, \quad \text{for } n = 1, 2, \ldots, \]

where 
n = number of customers in system, 
μ_n = mean service rate when there are n customers in system, 
1/μ_1 = expected "normal" service time (expected time to service customer 
        when that customer is the only one in system), 
c = "pressure coefficient" (positive constant that indicates degree to 
    which service rate of system is affected by system state). 


Thus, selecting c = 1, for example, hypothesizes that the mean service rate is directly 
proportional to n; c = 1/2 implies that the mean service rate is proportional to the 
square root of n; and so on. The preceding queueing models in this section have 
implicitly assumed that c = 0. 

Now assume additionally that the queueing system has a Poisson input with 
λ_n = λ (for n = 0, 1, 2, ...) and exponential service times with μ_n as just given. 
This case is now a special case of the birth-and-death process, where 

\[ C_n = \frac{(\lambda/\mu_1)^n}{(n!)^c}, \quad \text{for } n = 0, 1, 2, \ldots. \]

Thus all the steady-state results given in Sec. 16.5 are applicable to this model. (A 
steady-state condition always can be reached when c > 0.) Unfortunately, analytical 
expressions are not available for the summations involved. However, nearly exact 
values of P_0 and L have been tabulated¹ for various values of c and λ/μ_1 by summing 
a finite number of terms on a computer. A small portion of these results also is shown 
in Figs. 16.9 and 16.10. 

A queueing system may react to a long queue by decreasing the arrival rate 
instead of increasing the service rate. (The arrival rate may be decreased, for example, 
by diverting some of the customers requiring service to another service facility.) The 
corresponding model for describing mean arrival rates for this case is to let 


\[ \lambda_n = (n + 1)^{-b} \lambda_0, \quad \text{for } n = 0, 1, 2, \ldots, \]

where b is a constant whose interpretation is analogous to that for c. The C_n values 
for the birth-and-death process with these λ_n (and with μ_n = μ for n = 1, 2, ...) 
are identical to those just shown (replacing λ by λ_0) for the state-dependent service 
rate model when c = b and λ/μ_1 = λ_0/μ, so the steady-state results also are the 
same. 

A more general model that combines these two patterns can also be used when 
both the mean arrival rates and the mean service rates are state-dependent. Thus let 


\[ \mu_n = n^a \mu_1, \quad \text{for } n = 1, 2, \ldots, \]
\[ \lambda_n = (n + 1)^{-b} \lambda_0, \quad \text{for } n = 0, 1, 2, \ldots. \]

Once again, the C_n values for the birth-and-death process with these parameters are 
identical to those shown for the state-dependent service rate model when c = a + b 
and λ/μ_1 = λ_0/μ_1, so the tabulated steady-state results actually are applicable to 
this general model. 


1 Conway, Richard W., and William L. Maxwell: "A Queueing Model with State Dependent Service 
Rate," Journal of Industrial Engineering, 12:132-136, 1961. 

Figure 16.10 Values of L for the state-dependent model (Sec. 16.6). 

FORMULATION FOR THE MULTIPLE-SERVER CASE (s > 1): To generalize this 
combined model further to the multiple-server case, it would seem natural to have the 
μ_n and λ_n vary with the number of customers per server (n/s) in essentially the same 
way they vary with n for the single-server case. Thus let 

\[ \mu_n = \begin{cases} n\mu_1, & \text{if } n \le s \\[1ex] \left(\dfrac{n}{s}\right)^a s\mu_1, & \text{if } n \ge s, \end{cases} \]

\[ \lambda_n = \begin{cases} \lambda_0, & \text{if } n \le s - 1 \\[1ex] \left(\dfrac{s}{n + 1}\right)^b \lambda_0, & \text{if } n \ge s - 1. \end{cases} \]

Therefore, the birth-and-death process with these parameters has 

\[ C_n = \begin{cases}
\dfrac{(\lambda_0/\mu_1)^n}{n!}, & \text{for } n = 1, 2, \ldots, s \\[2ex]
\dfrac{(\lambda_0/\mu_1)^n}{s!\,(n!/s!)^c\,s^{(1-c)(n-s)}}, & \text{for } n = s, s + 1, \ldots,
\end{cases} \]

where c = a + b. 

Computational results for P_0, L_q, and L have been tabulated¹ for various values 
of c, (λ_0/μ_1), and s. Some of these results also are given in Figs. 16.9 and 16.10. 


EXAMPLE: After gathering additional data for the County Hospital emergency room, 
the Management Engineer found that the time a doctor spends with a patient tends to 
decrease as the number of patients waiting increases. Part of the explanation is simply 
that the doctor works faster, but the main reason is that more of the treatment is turned 
over to a nurse for completion. The pattern of the μ_n (the mean rate at which a doctor 
treats patients while there are a total of n patients to be treated in the emergency 
room) seems to fit reasonably the state-dependent service rate model presented here. 
Therefore, the Management Engineer has decided to apply this model. 

The new data indicate that the average time a doctor spends treating a patient 
is 24 minutes if no other patients are waiting, whereas this average becomes 12 minutes 
when each doctor has six patients (so five are waiting their turn). Thus, with a single 
doctor on duty, 


\[ \mu_1 = 2\tfrac{1}{2} \text{ customers per hour}, \qquad \mu_6 = 5 \text{ customers per hour}. \]

Therefore, the pressure coefficient c (or a in the general model) must satisfy the 
relationship 

\[ \mu_6 = 6^c \mu_1, \quad \text{so } 6^c = 2. \]

Using logarithms to solve for c yields c = 0.4. Because λ = 2 from before, this 
solution for c completes the specification of parameter values for this model. 
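The arithmetic behind these parameter values can be written out as follows. This is only a worked restatement of the data given above (24 and 12 minutes per patient), with the logarithms rounded to four decimal places.

```latex
\begin{aligned}
\mu_1 &= \frac{60}{24} = 2.5, \qquad \mu_6 = \frac{60}{12} = 5 \quad \text{(customers per hour)},\\[1ex]
6^c &= \frac{\mu_6}{\mu_1} = 2
\;\Longrightarrow\;
c = \frac{\ln 2}{\ln 6} \approx \frac{0.6931}{1.7918} \approx 0.387 \approx 0.4 .
\end{aligned}
```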

To compare the two alternatives of having one doctor (s = 1) or two doctors 
(s = 2) on duty, the Management Engineer developed the various measures of 
performance shown in Table 16.3. The values of P_0, L, and (for s = 2) L_q were obtained 
directly from the tabulated results for this model. (Except for this L_q, you can 
approximate the same values from Figs. 16.9 and 16.10.) These values were then used 
to calculate 

\[ P_1 = C_1 P_0, \]

\[ L_q = L - (1 - P_0), \quad \text{if } s = 1, \]

\[ [\,L_q = L - P_1 - 2(1 - P_0 - P_1), \quad \text{if } s = 2\,] \]

\[ W_q = \frac{L_q}{\lambda}, \qquad W = \frac{L}{\lambda}, \]

\[ P\{W_q > 0\} = 1 - \sum_{n=0}^{s-1} P_n. \]

The fact that some of the results in Table 16.3 do not deviate substantially from 
those in Table 16.2 reinforces the tentative conclusion that a single doctor will be 
inadequate next year. 


1 Hillier, F. S., R. W. Conway, and W. L. Maxwell: "A Multiple Server Queueing Model with State 
Dependent Service Rate," Journal of Industrial Engineering, 15:153-157, 1964. 

Table 16.3 Steady-State Results from State-Dependent Service Rate Model 
for County Hospital Problem 

                    s = 1       s = 2 
λ/(sμ_1)            0.8         0.4 
λ/(sμ_6)            0.4         0.2 
P_0                 0.367       0.440 
P_1                 0.294       0.352 
L_q                 0.618       0.095 
L                   1.251       0.864 
W_q                 0.309       0.048     (in hours) 
W                   0.626       0.432     (in hours) 
P{W_q > 0}          0.633       0.208 
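The tabulated values used for Table 16.3 come from summing a finite number of terms of the C_n series, and the same idea is easy to sketch in code. The example below assumes the C_n expression given above for the state-dependent model with c = a + b; the function name `state_dependent_measures` and the truncation point `n_max` are hypothetical choices, and the truncation is only adequate when the terms decay quickly (as they do for c > 0 and moderate λ_0/μ_1).

```python
from math import exp, lgamma, log


def state_dependent_measures(lam0, mu1, s, c, n_max=200):
    """Approximate P0, P1, L, Lq by truncating the C_n series at n_max."""
    r = lam0 / mu1

    def log_c(n):
        # Logarithm of C_n; lgamma(n + 1) equals log(n!).
        if n <= s:
            return n * log(r) - lgamma(n + 1)
        return (n * log(r) - lgamma(s + 1)
                - c * (lgamma(n + 1) - lgamma(s + 1))
                - (1 - c) * (n - s) * log(s))

    cs = [exp(log_c(n)) for n in range(n_max + 1)]
    total = sum(cs)
    p = [x / total for x in cs]                     # P_n = C_n * P_0
    L = sum(n * pn for n, pn in enumerate(p))
    Lq = sum((n - s) * pn for n, pn in enumerate(p) if n > s)
    return {"P0": p[0], "P1": p[1], "L": L, "Lq": Lq}


# County Hospital data: lambda = 2, mu_1 = 2.5, c = 0.4.
for s in (1, 2):
    m = state_dependent_measures(2, 2.5, s, 0.4)
    print(s, {name: round(value, 3) for name, value in m.items()})
```

Working with logarithms of C_n avoids overflow in n! for large n; W and W_q then follow by dividing L and L_q by λ.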











16.7 Queueing Models Involving Nonexponential 
Distributions 


Because all the queueing theory models in the preceding section (except for one 
generalization) are based on the birth-and-death process, both their interarrival and 
service times are required to have exponential distributions. As discussed in Sec. 
16.4, this type of probability distribution has many convenient properties for queueing 
theory, but it provides a reasonable fit for only certain kinds of queueing systems. In 
particular, the assumption of exponential interarrival times implies that arrivals occur 
randomly (a Poisson input process), which is a reasonable approximation in many 
situations but not when the arrivals are carefully scheduled or regulated. Furthermore, 
the actual service-time distribution frequently deviates greatly from the exponential 
form, particularly when the service requirements of the customers are quite similar. 
Therefore, it is important to have available other queueing models that use alternative 
distributions. 

Unfortunately, the mathematical analysis of queueing models with nonexponen- 
tial distributions is much more difficult. However, it has been possible to obtain some 
useful results for a few such models. This analysis is beyond the level of this book, 
but in this section we shall summarize the models and describe their results. 


The M/G/1 Model 


As introduced in Sec. 16.2, the M/G/1 model assumes that the queueing system has 
a single server and a Poisson input process (exponential interarrival times) with a 
fixed mean arrival rate λ. As usual, it is assumed that the customers have independent 
service times with the same probability distribution. However, no restrictions are 
imposed on what this service-time distribution can be. In fact, it is only necessary to 
know (or estimate) the mean 1/μ and variance σ² of this distribution. 

Any such queueing system can eventually reach a steady-state condition if ρ = 
λ/μ < 1. The readily available steady-state results¹ for this general model are the 
following: 


1 A recursion formula also is available for calculating the probability distribution of the number of customers 
in the system; see Hordijk, A., and H. C. Tijms: Statistica Neerlandica, 30:97-100, 1976. 


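\[ P_0 = 1 - \rho, \]

\[ L_q = \frac{\lambda^2 \sigma^2 + \rho^2}{2(1 - \rho)}, \]

\[ L = \rho + L_q, \]

\[ W_q = \frac{L_q}{\lambda}, \]

\[ W = W_q + \frac{1}{\mu}. \]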


Considering the complexity involved in analyzing a model that permits any 
service-time distribution, it is remarkable that such a simple formula can be obtained for L_q. 
This formula is one of the most important results in queueing theory because of its 
ease of use and the prevalence of M/G/1 queueing systems in practice. This equation 
for L, (or its counterpart for W,) commonly is referred to as the Pollaczek-Khintchine 
formula, named after the Frenchman and the Russian who derived it more than 50 
years ago. 

For any fixed expected service time 1/μ, notice that L_q, L, W_q, and W all 
increase as σ² is increased. This result is important because it indicates that the 
consistency of the server has a major bearing on the performance of the service 
facility —not just his average speed. This key point is illustrated in the next subsection. 

When the service-time distribution is exponential, σ² = 1/μ², and the preceding 
results will reduce to the corresponding results for the M/M/1 model given at the 
beginning of Sec. 16.6. 
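Since the M/G/1 results need only λ, 1/μ, and σ, they are straightforward to evaluate. The following minimal sketch applies the Pollaczek-Khintchine formula as stated above; the function name `mg1_measures` is hypothetical, and the sample rates (λ = 2, μ = 3) are borrowed from the County Hospital example purely for illustration.

```python
def mg1_measures(lam, mu, sigma):
    """M/G/1 steady-state measures; sigma is the service-time standard deviation."""
    rho = lam / mu
    if rho >= 1:
        raise ValueError("steady state requires rho = lam/mu < 1")
    # Pollaczek-Khintchine formula for the expected queue length.
    lq = (lam**2 * sigma**2 + rho**2) / (2 * (1 - rho))
    wq = lq / lam
    return {"P0": 1 - rho, "Lq": lq, "L": rho + lq, "Wq": wq, "W": wq + 1 / mu}


# Exponential service times (sigma = 1/mu) reproduce the M/M/1 results.
exponential_case = mg1_measures(2, 3, 1 / 3)
# A more consistent server (smaller sigma, same mean) shortens the queue.
steadier_case = mg1_measures(2, 3, 1 / 6)
print(exponential_case["Lq"], steadier_case["Lq"])
```

Comparing the two calls shows the effect of σ with the mean service time held fixed, which is the point made above about the consistency of the server.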

The complete flexibility in the service-time distribution provided by this model 
is extremely useful, so it is unfortunate that efforts to derive similar results for the 
multiple-server case have been unsuccessful. However, some multiple-server results 
have been obtained for the important special cases described by the following two 
models. 


The M/D/s Model 


When the service consists of essentially the same routine task to be performed for all 
customers, there tends to be little variation in the service time required. The M/D/s 
model often provides a reasonable representation for this kind of situation, because it 
assumes that all service times actually equal some fixed constant (the degenerate 
service-time distribution) and that we have a Poisson input process with a fixed mean 
arrival rate λ. 

When there is just a single server, the M/D/1 model is just the special case of 
the M/G/1 model where σ² = 0, so that the Pollaczek-Khintchine formula reduces 
to 

\[ L_q = \frac{\rho^2}{2(1 - \rho)}, \]

where L, W_q, and W are obtained from L_q as just shown. Notice that this L_q and W_q 
are exactly half as large as for the exponential service-time case of Sec. 16.6 (the 
M/M/1 model), where σ² = 1/μ², so decreasing σ² can greatly improve the 
measures of performance of a queueing system. 

For the multiple-server version of this model (M/D/s), a complicated method 
is available¹ for deriving the steady-state probability distribution of the number of 
customers in the system and its mean (assuming ρ = λ/(sμ) < 1). However, these 
results have been tabulated for numerous cases,² and the means (L) also are given 
graphically in Fig. 16.11. 

Figure 16.11 Values of L for the M/D/s model (Sec. 16.7). 


The M/E,,/s Model 


The M/D/s model assumes zero variation in the service times (σ = 0), whereas the 
exponential service-time distribution assumes a very large variation (σ = 1/μ). 
Between these two rather extreme cases lies a long middle ground (0 < σ < 1/μ), 
where most actual service-time distributions fall. Another kind of theoretical 
service-time distribution that fills this middle ground is the Erlang distribution (named after 
the founder of queueing theory). 

1 See Prabhu, N. U.: Queues and Inventories, pp. 32-34, Wiley, New York, 1965; also see pp. 344-346 
in Selected Reference 3. 


2 Hillier, F. S., and O. S. Yu, with D. Avis, L. Fossett, F. Lo, and M. Reiman: Queueing Tables and 
Graphs, Elsevier North-Holland, New York, 1981. 


The probability density function for the Erlang distribution is 

\[ f(t) = \frac{(\mu k)^k}{(k - 1)!}\, t^{k-1} e^{-k\mu t}, \quad \text{for } t \ge 0, \]

where μ and k are strictly positive parameters of the distribution, and k is further 
restricted to be integer. (Except for this integer restriction, and the definition of the 
parameters, this distribution is identical to the gamma distribution.) Its mean and 
standard deviation are 

\[ \text{Mean} = \frac{1}{\mu}, \qquad \text{Standard deviation} = \frac{1}{\sqrt{k}}\,\frac{1}{\mu}. \]

Thus k is the parameter that specifies the degree of variability of the service times 
relative to the mean. It usually is referred to as the shape parameter. 

The Erlang distribution is a very important distribution in queueing theory for 
two reasons. To describe the first one, suppose that T_1, T_2, ..., T_k are k independent 
random variables with an identical exponential distribution whose mean is 1/(kμ). Then 
their sum, 

\[ T = T_1 + T_2 + \cdots + T_k, \]

has an Erlang distribution with parameters μ and k. The discussion of the exponential 
distribution in Sec. 16.4 suggested that the time required to perform certain kinds of 
tasks might well have an exponential distribution. However, the total service required 
by a customer may involve the server performing not just one specific task but a 
sequence of k tasks. If the respective tasks have an identical exponential distribution 
for their duration, the total service time would have an Erlang distribution, which 
would be the case, for example, if the server must perform the same exponential task 
k times for each customer. 
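The mean and standard deviation quoted earlier for the Erlang distribution follow directly from this sum representation. As a short check, using only the independence of the T_i and the fact that an exponential random variable with mean 1/(kμ) has variance 1/(kμ)²:

```latex
\begin{aligned}
E(T) &= \sum_{i=1}^{k} E(T_i) = k \cdot \frac{1}{k\mu} = \frac{1}{\mu},\\[1ex]
\operatorname{Var}(T) &= \sum_{i=1}^{k} \operatorname{Var}(T_i) = k \cdot \frac{1}{(k\mu)^2} = \frac{1}{k\mu^2},\\[1ex]
\sigma &= \sqrt{\operatorname{Var}(T)} = \frac{1}{\sqrt{k}}\,\frac{1}{\mu}.
\end{aligned}
```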

The Erlang distribution also is very useful because it is a large (two-parameter) 
family of distributions permitting only nonnegative values. Hence empirical service- 
time distributions can usually be reasonably approximated by an Erlang distribution. 
In fact, both the exponential and the degenerate (constant) distributions are special 
cases of the Erlang distribution, with k = 1 and k = ∞, respectively. Intermediate 
values of k provide intermediate distributions with mean = 1/μ, mode = 
(k - 1)/(kμ), and variance = 1/(kμ²), as suggested by Fig. 16.12. 

Now consider the M/E_k/1 model, which is just the special case of the M/G/1 
model where service times have an Erlang distribution with shape parameter = k. 
Applying the Pollaczek-Khintchine formula with σ² = 1/(kμ²) (and the accompanying 
results given for M/G/1) yields 

\[ L_q = \frac{\lambda^2/(k\mu^2) + \rho^2}{2(1 - \rho)} = \frac{1 + k}{2k}\,\frac{\lambda^2}{\mu(\mu - \lambda)}, \]

\[ W_q = \frac{1 + k}{2k}\,\frac{\lambda}{\mu(\mu - \lambda)}, \]

\[ W = W_q + \frac{1}{\mu}, \]

\[ L = \lambda W. \]

Figure 16.12 A family of Erlang distributions with constant mean 1/μ. 

With multiple servers (M/E_k/s), the relationship of the Erlang distribution to 
the exponential distribution just described can be exploited to formulate a modified 
birth-and-death process (continuous time Markov chain) in terms of individual 
exponential service phases (k per customer) rather than complete customers. However, 
it has not been possible to derive a general steady-state solution (when ρ = λ/(sμ) < 
1) for the probability distribution of the number of customers in the system as we did 
in Sec. 16.5. Instead, advanced theory is required to solve individual cases 
numerically. Once again, these results have been obtained and tabulated for numerous cases.¹ 
The means (L) also are given graphically in Fig. 16.13 for some cases where s = 2. 
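Returning to the single-server M/E_k/1 results, the effect of the shape parameter k is easy to tabulate. The sketch below simply evaluates the closed-form expressions above; the function name `mek1_measures` is hypothetical, and the rates λ = 2 and μ = 3 are reused from the County Hospital example for illustration only.

```python
def mek1_measures(lam, mu, k):
    """M/E_k/1 steady-state measures for Erlang service with shape parameter k."""
    if lam >= mu:
        raise ValueError("steady state requires lam < mu")
    if k < 1:
        raise ValueError("shape parameter k must be a positive integer")
    factor = (1 + k) / (2 * k)
    lq = factor * lam**2 / (mu * (mu - lam))
    wq = factor * lam / (mu * (mu - lam))
    w = wq + 1 / mu
    return {"Lq": lq, "Wq": wq, "W": w, "L": lam * w}


# k = 1 is the exponential case; large k approaches constant service times.
for k in (1, 2, 4, 16, 1000):
    print(k, round(mek1_measures(2, 3, k)["Lq"], 4))
```

As k grows, the factor (1 + k)/(2k) falls from 1 toward 1/2, so L_q moves from the M/M/1 value toward the M/D/1 value.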


Models without a Poisson Input 


All the queueing models presented thus far have assumed a Poisson input process 
(exponential interarrival times). However, this assumption is violated if the arrivals 
are scheduled or regulated in some way that prevents them from occurring randomly, 
in which case another model is needed. 

As long as the service times have an exponential distribution with a fixed pa- 
rameter, three such models are readily available. These models are obtained by merely 
reversing the assumed distributions of the interarrival and service times in the pre- 
ceding three models. Thus the first new model (GI/M/s) imposes no restriction on 
what the interarrival time distribution can be. In this case, there are some 
steady-state results available² (particularly in regard to waiting-time distributions) for both 
the single-server and multiple-server versions of the model, but these results are not 
nearly as convenient as the simple expressions given for the M/G/1 model. The 
second new model (D/M/s) assumes that all interarrival times equal some fixed 
constant, which would represent a queueing system where arrivals are scheduled at 
regular intervals. The third new model (E_k/M/s) assumes an Erlang interarrival time 
distribution, which provides a middle ground between regularly scheduled (constant) 
and completely random (exponential) arrivals. Extensive computational results have 
been tabulated¹ for these latter two models, including the values of L given graphically 
in Figs. 16.14 and 16.15. 

1 Ibid. 
2 For example, see pp. 304-320 of Selected Reference 3. 

Figure 16.13 Values of L for the M/E_k/2 model (Sec. 16.7). 

If neither the interarrival times nor the service times for a queueing system have 
an exponential distribution, then there are three additional queueing models for which 
computational results also are available.² One of these models (E_m/E_k/s) assumes an 
Erlang distribution for both these times. The other two models (E_k/D/s and D/E_k/s) 
assume that one of these times has an Erlang distribution and the other time equals 
some fixed constant. 


Other Models 


1 F. S. Hillier and O. S. Yu, op. cit. 
2 Ibid. 

Figure 16.14 Values of L for the D/M/s model (Sec. 16.7). 

Although you have seen in this section a large number of queueing models that involve 
nonexponential distributions, we have far from exhausted the list. For example, 
another distribution that occasionally is used for either interarrival times or service times 
is the hyperexponential distribution. The key characteristic of this distribution is 
that, even though only nonnegative values are allowed, its standard deviation, σ, 
actually is larger than its mean, 1/μ. This characteristic is in contrast to the Erlang 
distribution, where σ < 1/μ in every case except k = 1 (exponential distribution), 
which has σ = 1/μ. To illustrate a typical situation where σ > 1/μ can occur, we 
suppose that the service involved in the queueing system is the repair of some kind 
of machine or vehicle. If many of the repairs turn out to be routine (small service 
times) but occasional repairs require an extensive overhaul (very large service times), 
then the standard deviation of service times will tend to be quite large relative to the 
mean, in which case the hyperexponential distribution may be used to represent the 
service-time distribution. 
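The repair illustration can be made concrete with a small calculation. The sketch below treats the hyperexponential distribution as a probabilistic mixture of exponential distributions (the phases-in-parallel view) and computes its mean and standard deviation from the standard moments of the exponential distribution; the function name `hyperexp_mean_std` and the repair-shop numbers are hypothetical.

```python
from math import sqrt


def hyperexp_mean_std(probs, rates):
    """Mean and standard deviation of a mixture of exponential distributions.

    With probability probs[i] the time is exponential with rate rates[i].
    """
    if abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("branch probabilities must sum to 1")
    mean = sum(p / r for p, r in zip(probs, rates))
    # E(T^2) for an exponential with rate r is 2/r^2.
    second_moment = sum(2 * p / r**2 for p, r in zip(probs, rates))
    return mean, sqrt(second_moment - mean**2)


# Hypothetical shop: 90% routine repairs (mean 1 hour),
# 10% extensive overhauls (mean 10 hours).
mean, std = hyperexp_mean_std([0.9, 0.1], [1.0, 0.1])
print(round(mean, 3), round(std, 3), std > mean)
```

With a single branch the function returns equal mean and standard deviation, which is the exponential case; mixing a fast branch with a slow one pushes σ above the mean.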

Another family of distributions coming into general use consists of phase-type 
distributions (some of which also are called generalized Erlangian distributions). 
These distributions are obtained by breaking down the total time into a number of 
phases, each having an exponential distribution, where the parameters of these ex- 
ponential distributions may be different and the phases may be either in series or in 
parallel (or both). A group of phases being in parallel means that the process randomly 
selects one of the phases to go through each time according to specified probabilities. 

Figure 16.15 Values of L for the E_k/M/2 model (Sec. 16.7). 


This approach is, in fact, how the hyperexponential distribution is derived, so this 
distribution is a special case of the phase-type distributions. Another special case is 
the Erlang distribution, which has the restrictions that all of its k phases are in series 
and that these phases have the same parameter for their exponential distributions. 
Removing these restrictions means that phase-type distributions in general can provide 
considerably more flexibility than the Erlang distribution in fitting the actual distri- 
bution of interarrival times or service times observed in a real queueing system. This 
flexibility is especially valuable when using the actual distribution directly in the model 
is not analytically tractable, whereas the ratio of the mean to the standard deviation 
for the actual distribution does not closely match the available ratios (√k for k = 1, 
2, ...) for the Erlang distribution. 

Because they are built up from combinations of exponential distributions, 
queueing models using phase-type distributions still can be represented by a continuous 
time Markov chain. This Markov chain generally will have an infinite number of 
states, so solving for the steady-state distribution of the state of the system requires 
solving an infinite system of linear equations that has a relatively complicated
structure. Solving such a system is far from routine, but recent theoretical advances
have enabled us to solve these queueing models numerically in some cases. An
extensive tabulation of these results for models with various phase-type distributions
(including the hyperexponential distribution) now is available.¹


16.8 A Priority-Discipline Queueing Model 


Priority-discipline queueing models are those where the queue discipline is based on 
a priority system. Thus the order in which members of the queue are selected for 
service is on the basis of their assigned priorities. 

Many real queueing systems fit these priority-discipline models much more 
closely than other available models. Rush jobs are taken ahead of other jobs, and 
important customers may be given precedence over others. Therefore, the use of 
priority-discipline models often provides a very welcome refinement over the more 
usual queueing models. 

Unfortunately, the inclusion of priorities makes the mathematical analysis fairly 
complicated, so that only limited results are available. Almost all these results are for 
the single-server case. However, usable results are available for one multiple-server 
model. This model assumes that there are N priority classes (class 1 has the highest 
priority and class N the lowest) and that, whenever a server becomes free to begin 
serving a new customer from the queue, the one selected is that member of the highest 
priority class represented in the queue who has waited longest. In other words,
customers are selected to begin service in the order of their priority classes, but on a
first-come-first-served basis within each priority class. A Poisson input process and 
exponential service times are assumed for each priority class. The model also makes 
the somewhat restrictive assumption that the mean service time is the same for all 
priority classes. However, it does permit the mean arrival rate to differ among the 
priority classes. 

Notice that if this distinction between customers in different priority classes is 
ignored, this model actually fits the M/M/s model studied in Sec. 16.6. Therefore, 
when we count just the total number of customers in the system, the steady-state 
distribution given there still applies here. Consequently, the formulas for L and $L_q$
also carry over, as do the expected waiting-time results (by Little's formula), W and
$W_q$, for a randomly selected customer. What changes is the distribution of waiting
times, which was derived in Sec. 16.6 under the assumption of a first-come-first-served
queue discipline. With a priority discipline, this distribution has a much larger
variance, because the waiting times of customers in the highest priority classes tend
to be much smaller than under a first-come-first-served discipline, whereas the waiting
times in the lowest priority classes tend to be much larger. By the same token, the
breakdown of the total number of customers in the system tends to be disproportionately
weighted toward the lower priority classes. But this condition is just the reason
for imposing priorities on the queueing system in the first place. We want to improve
the measures of performance for each of the higher priority classes at the expense of
performance for the lower priority classes. To determine how much improvement is
being made, we need to obtain such measures as expected waiting time in the system 


¹ Seelen, L. P., H. C. Tijms, and M. H. Van Hoorn: Tables for Multi-Server Queues, North-Holland,
Amsterdam, 1985.


and expected number of customers in the system for the individual priority classes. 
We obtain these measures next for this priority-discipline model. 

First consider the case of nonpreemptive priorities, where a customer being 
served cannot be ejected back into the queue (preempted) if a higher priority customer 
enters the queueing system. Therefore, once a server has begun serving a customer, 
the service must be completed without interruption. Under this assumption for the 
priority-discipline model, $W_k$, the steady-state expected waiting time in the system
(including service time) for a member of priority class k, is


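$$W_k = \frac{1}{A B_{k-1} B_k} + \frac{1}{\mu}, \quad \text{for } k = 1, 2, \ldots, N,$$

where

$$A = s! \, \frac{s\mu - \lambda}{r^s} \sum_{j=0}^{s-1} \frac{r^j}{j!} + s\mu,$$

$$B_0 = 1,$$

$$B_k = 1 - \frac{\sum_{i=1}^{k} \lambda_i}{s\mu}, \quad \text{for } k = 1, 2, \ldots, N,$$

and

s = number of servers,
$\mu$ = mean service rate per busy server,
$\lambda_i$ = mean arrival rate for priority class i, for i = 1, 2, ..., N,
$\lambda = \sum_{i=1}^{N} \lambda_i$,
$r = \lambda / \mu$.

(These results assume that

$$\sum_{i=1}^{k} \lambda_i < s\mu,$$
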
so that priority class k can reach a steady-state condition.) Little's formula still applies
to individual priority classes, so $L_k$, the steady-state expected number of members of
priority class k in the queueing system (including those being served), is

$$L_k = \lambda_k W_k, \quad \text{for } k = 1, 2, \ldots, N.$$


To determine the expected waiting time in the queue (excluding service time) for
priority class k, merely subtract $1/\mu$ from $W_k$; the corresponding expected queue length
is again obtained by multiplying by $\lambda_k$. For the special case where s = 1, the
expression for A reduces to $A = \mu^2/\lambda$.
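
As a rough illustration of how the nonpreemptive formulas fit together, the sketch below evaluates A, the $B_k$, and the $W_k$ for a list of class arrival rates. The function name `nonpreemptive_waits` is hypothetical, and the sample call assumes the County Hospital data used later in this section ($\mu = 3$ and class arrival rates 0.2, 0.6, and 1.2).

```python
from math import factorial

def nonpreemptive_waits(lams, mu, s):
    """Return [W_1, ..., W_N], the expected time in the system (including
    service) for each priority class under nonpreemptive priorities.

    lams : arrival rates, highest priority class first
    mu   : mean service rate per busy server (same for all classes)
    s    : number of servers; requires sum(lams) < s * mu
    """
    lam = sum(lams)
    r = lam / mu
    A = (factorial(s) * (s * mu - lam) / r ** s
         * sum(r ** j / factorial(j) for j in range(s))
         + s * mu)
    waits = []
    b_prev = 1.0          # B_0 = 1
    cumulative = 0.0      # lambda_1 + ... + lambda_k
    for lam_k in lams:
        cumulative += lam_k
        b_k = 1.0 - cumulative / (s * mu)
        waits.append(1.0 / (A * b_prev * b_k) + 1.0 / mu)
        b_prev = b_k
    return waits

lams = [0.2, 0.6, 1.2]
for s in (1, 2):
    waits = nonpreemptive_waits(lams, mu=3.0, s=s)
    queue_waits = [w - 1.0 / 3.0 for w in waits]          # W_k - 1/mu
    numbers = [lam_k * w for lam_k, w in zip(lams, waits)]  # L_k by Little's formula
    print(s, queue_waits, numbers)
```

Subtracting $1/\mu$ from each returned value gives the expected wait in the queue, and multiplying by $\lambda_k$ gives the corresponding expected number of customers, as described above.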

Now consider the case of preemptive priorities, whereby the lowest priority 
customer being served is preempted (ejected back into the queue) whenever a higher 
priority customer enters the queueing system. A server is thereby freed to begin 
servicing the new arrival immediately. (When a server does succeed in finishing a 
service, the next customer to begin receiving service is selected in just the same way 
as described earlier, so a preempted customer normally will get back into service again 
and, after enough tries, eventually finish.) If the other assumptions of the preceding 
model are retained, this preemption feature changes the total expected waiting time
in the system (including total service time) to

$$W_k = \frac{1/\mu}{B_{k-1} B_k}, \quad \text{for } k = 1, 2, \ldots, N,$$


for the single-server case (s = 1). When s > 1, the $W_k$ can be calculated by an
iterative procedure that will be illustrated soon by the County Hospital example. The
$L_k$ just defined continue to satisfy the relationship

$$L_k = \lambda_k W_k, \quad \text{for } k = 1, 2, \ldots, N.$$


The corresponding results for the queue (excluding customers in service) also can be
obtained from $W_k$ and $L_k$ as just described for the case of nonpreemptive priorities.
Because of the lack of memory property of the exponential distribution (see Sec.
16.4), preemptions do not affect the service process (occurrence of service
completions) in any way. The expected total service time for any customer still is $1/\mu$. This
lack of memory property also implies that we don’t need to worry about defining the 
point at which service begins when a preempted customer returns to service; the 
distribution of the remaining service time always is the same. (For any other service- 
time distribution, it becomes important to distinguish between preemptive-resume 
systems, where service for a preempted customer resumes at the point of interruption, 
and preemptive-repeat systems, where service must start at the beginning again.) 
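
The single-server preemptive formula can be evaluated in a few lines. The sketch below is only illustrative: the function name is hypothetical, and the sample call assumes the County Hospital data used later in this section ($\mu = 3$ and class arrival rates 0.2, 0.6, and 1.2).

```python
def preemptive_waits_single_server(lams, mu):
    """Return [W_1, ..., W_N] for preemptive priorities with one server (s = 1).

    Uses W_k = (1/mu) / (B_{k-1} * B_k) with B_0 = 1 and
    B_k = 1 - (lambda_1 + ... + lambda_k) / mu.
    """
    waits = []
    b_prev = 1.0        # B_0
    cumulative = 0.0
    for lam_k in lams:
        cumulative += lam_k
        b_k = 1.0 - cumulative / mu
        waits.append((1.0 / mu) / (b_prev * b_k))
        b_prev = b_k
    return waits

lams = [0.2, 0.6, 1.2]
waits = preemptive_waits_single_server(lams, mu=3.0)
print([w - 1.0 / 3.0 for w in waits])                  # W_k - 1/mu for each class
print([lam_k * w for lam_k, w in zip(lams, waits)])    # L_k = lambda_k * W_k
```

Note that $W_1$ here equals $1/(\mu - \lambda_1)$, the M/M/1 result with only class 1 present, which reflects the fact that the highest priority class is unaffected by the lower classes.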

Some results are also available¹ for a few other single-server, priority-discipline
models involving other service-time distributions and/or unequal expected service 
times. 


EXAMPLE: For the County Hospital emergency room problem, the Management 
Engineer has noticed that the patients are not treated on a first-come-first-served basis. 
Rather, the admitting nurse seems to divide the patients into roughly three categories: 
(1) critical cases, where prompt treatment is vital for survival; (2) serious cases, 
where early treatment is important to prevent further deterioration; and (3) stable 
cases, where treatment can be delayed without adverse medical consequences. Patients 
are then treated in this order of priority, where those in the same category are normally 
taken on a first-come-first-served basis. A doctor will interrupt treatment of a patient 
if a new case in a higher priority category arrives. Approximately 10 percent of the 
patients fall into the first category, 30 percent into the second, and 60 percent into 
the third. Because the more serious cases will be sent into the hospital for further care 
after receiving emergency treatment, the average treatment time by a doctor in the 
emergency room actually does not differ greatly among these categories. 

¹ See Selected Reference 4.

The Management Engineer has decided to use the priority-discipline queueing
model just described as a reasonable representation of this queueing system, where
the three categories of patients constitute the three priority classes in the model.
Because treatment is interrupted by the arrival of a higher priority case, the preemptive
priorities version of this model is the appropriate one. Given the previously available
data ($\mu = 3$ and $\lambda = 2$), the preceding percentages yield $\lambda_1 = 0.2$, $\lambda_2 = 0.6$, $\lambda_3 = 1.2$.
Table 16.4 gives the resulting expected waiting times in the queue (so excluding
treatment time) in hours for the respective priority classes¹ when there is one
doctor (s = 1) or two doctors (s = 2) on duty. (The corresponding results for the
nonpreemptive priorities version of the model also are given in Table 16.4 to show
the effect of preempting.)

Table 16.4 Steady-State Results from Priority-Discipline Model for County Hospital Problem

                 Preemptive Priorities     Nonpreemptive Priorities
                 s = 1       s = 2         s = 1       s = 2
A                -           -             4.5         36
B_1              0.933       -             0.933       0.967
B_2              0.733       -             0.733       0.867
B_3              0.333       -             0.333       0.667
W_1 - 1/μ        0.024       0.00037       0.238       0.029
W_2 - 1/μ        0.154       0.00793       0.325       0.033
W_3 - 1/μ        1.033       0.06542       0.889       0.048

These preemptive priority results for s = 2 were obtained as follows. Because
the waiting times for priority class 1 customers are completely unaffected by the
presence of customers in lower priority classes, $W_1$ would be the same for any other
values of $\lambda_2$ and $\lambda_3$, including $\lambda_2 = 0$, $\lambda_3 = 0$. Therefore, $W_1$ must equal W for the
corresponding one-class model (the M/M/s model in Sec. 16.6) with s = 2, $\mu = 3$,
and $\lambda = \lambda_1 = 0.2$, which yields

$$W_1 = W \ (\text{for } \lambda = 0.2) = 0.33370,$$

so

$$W_1 - \frac{1}{\mu} = 0.33370 - 0.33333 = 0.00037.$$

Now consider the first two priority classes. Again note that customers in these classes
are completely unaffected by lower priority classes (just priority class 3 in this case),
which can therefore be ignored in the analysis. Let $W_{1-2}$ be the expected waiting
time in the system (so including service time) of a random arrival in either of these
two classes, so the probability is $\lambda_1/(\lambda_1 + \lambda_2) = 1/4$ that this arrival is in class 1
and $\lambda_2/(\lambda_1 + \lambda_2) = 3/4$ that it is in class 2. Therefore,

$$W_{1-2} = \tfrac{1}{4} W_1 + \tfrac{3}{4} W_2.$$

Furthermore, because expected waiting time is the same for any queue discipline,
$W_{1-2}$ must also equal W for the M/M/s model in Sec. 16.6, with s = 2, $\mu = 3$,
and $\lambda = \lambda_1 + \lambda_2 = 0.8$, which yields

$$W_{1-2} = W \ (\text{for } \lambda = 0.8) = 0.33937.$$


¹ Note that these expected times can no longer be interpreted as the expected time before treatment begins
when k > 1, because treatment may be interrupted at least once, causing additional waiting time, before
being completed.

Combining these facts gives

$$W_2 = \frac{4}{3}\left[0.33937 - \frac{1}{4}(0.33370)\right] = 0.34126,$$

$$\left(W_2 - \frac{1}{\mu} = 0.00793\right).$$

Finally, let $W_{1-3}$ be the expected waiting time in the system (so including service
time) for a random arrival in any of the three priority classes, so the probabilities are
0.1, 0.3, and 0.6 that it is in classes 1, 2, and 3, respectively. Therefore,

$$W_{1-3} = 0.1 W_1 + 0.3 W_2 + 0.6 W_3.$$

Furthermore, $W_{1-3}$ must also equal W for the M/M/s model in Sec. 16.6, with s = 2,
$\mu = 3$, and $\lambda = \lambda_1 + \lambda_2 + \lambda_3 = 2$, so that (from Table 16.2)

$$W_{1-3} = W \ (\text{for } \lambda = 2) = 0.375.$$

Consequently,

$$W_3 = \frac{1}{0.6}\left[0.375 - 0.1(0.33370) - 0.3(0.34126)\right] = 0.39875,$$

$$\left(W_3 - \frac{1}{\mu} = 0.06542\right).$$


The corresponding $W_q$ results for the M/M/s model in Sec. 16.6 also could have been
used in exactly the same way to derive the $[W_k - (1/\mu)]$ quantities directly.
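
The iterative procedure just illustrated can be sketched in code as follows. This is only a sketch under the assumptions of this section (Poisson arrivals, a common exponential service rate, preemptive priorities); the function names `mms_w` and `preemptive_waits` are hypothetical, and `mms_w` uses the standard steady-state formulas for the M/M/s model.

```python
from math import factorial

def mms_w(lam, mu, s):
    """Expected time in the system, W, for the M/M/s model (requires lam < s*mu)."""
    r = lam / mu
    rho = lam / (s * mu)
    p0 = 1.0 / (sum(r ** n / factorial(n) for n in range(s))
                + r ** s / (factorial(s) * (1.0 - rho)))
    lq = p0 * r ** s * rho / (factorial(s) * (1.0 - rho) ** 2)
    return lq / lam + 1.0 / mu

def preemptive_waits(lams, mu, s):
    """Return [W_1, ..., W_N] for preemptive priorities with s servers.

    For each k, classes 1..k are pooled into a single M/M/s model whose W
    equals the arrival-rate-weighted average of W_1, ..., W_k; solving that
    identity for W_k gives the next value.
    """
    waits = []
    cumulative = 0.0     # lambda_1 + ... + lambda_k
    weighted = 0.0       # lambda_1*W_1 + ... + lambda_{k-1}*W_{k-1}
    for lam_k in lams:
        cumulative += lam_k
        w_pooled = mms_w(cumulative, mu, s)
        w_k = (cumulative * w_pooled - weighted) / lam_k
        waits.append(w_k)
        weighted += lam_k * w_k
    return waits

waits = preemptive_waits([0.2, 0.6, 1.2], mu=3.0, s=2)
print([round(w - 1.0 / 3.0, 5) for w in waits])   # compare with the s = 2 column of Table 16.4
```

Because the code carries full precision rather than five-decimal intermediate values, its results may differ from the hand calculation in the last digit.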

When s = 1, the $[W_k - (1/\mu)]$ values in Table 16.4 for the preemptive priorities
case indicate that providing just a single doctor would cause critical cases to wait
about 1 1/2 minutes (0.024 hour) on the average, serious cases to wait more than 9
minutes, and stable cases to wait more than an hour. (Contrast these results with the
average wait of $W_q = 2/3$ hour for all patients that was obtained in Table 16.2 under
the first-come-first-served queue discipline.) However, these values represent
statistical expectations, so some patients have to wait considerably longer than the average
for their priority class. This wait would not be tolerable for the critical and serious
cases, where a few minutes can be vital. By contrast, the s = 2 results in Table 16.4
(preemptive priorities case) indicate that adding a second doctor would virtually
eliminate waiting for all but the stable cases. Therefore, the Management Engineer
recommended that there be two doctors on duty in the emergency room during the early
evening hours next year. The Board of Directors for County Hospital adopted this
recommendation and simultaneously raised the charge for using the emergency room!


16.9 Queueing Networks 


Thus far we have considered only queueing systems that have a single service facility 
with one or more servers. However, queueing systems encountered in operations 
research studies are sometimes actually queueing networks, i.e., networks of service
facilities where customers must receive service at some or all of these facilities. For
example, orders being processed through a job shop must be routed through a sequence 
of machine groups (service facilities). It is therefore necessary to study the entire 
network to obtain such information as expected total waiting time, expected number 
of customers in the entire system, and so forth. 

Because of the importance of queueing networks, research into this area has 
been very active. However, this is a difficult area, and most of the work has been 
confined to cases with a Poisson input process and exponential service times. Many 
of the results that have been obtained¹ have necessarily been quite involved and
unsuitable for general routine use.

However, there is available one simple result that is of such fundamental
importance for queueing networks that this finding and its implications warrant special
attention here. This fundamental result is the following equivalence property for the 
input process of arriving customers and the output process of departing customers for 
certain queueing systems. 


Equivalence property: Assume that a service facility with s servers and an
infinite queue has a Poisson input with parameter $\lambda$ and the same exponential
service-time distribution with parameter $\mu$ for each server (the M/M/s model),
where $s\mu > \lambda$. Then the steady-state output of this service facility is also a
Poisson process with parameter $\lambda$.²


Notice that this property makes no assumption about the type of queue discipline 
used. Whether it be first-come-first-served, random, or even a priority discipline as 
in Sec. 16.8, the served customers will leave the service facility according to a Poisson 
process. The crucial implication of this fact for queueing networks is that if these 
customers must then go to another service facility for further service, this second 
facility also will have a Poisson input. With an exponential service-time distribution, 
the equivalence property will hold for this facility as well, which can then provide a 
Poisson input for a third facility, etc. We discuss the consequences for two basic 
kinds of networks next. 


Infinite Queues in Series 


Suppose that customers must all receive service at a series of m service facilities in a 
fixed sequence. Assume that each facility has an infinite queue (no limitation on the 
number of customers allowed in the queue), so that the series of facilities form a 
system of infinite queues in series. Assume further that the customers arrive at the 
first facility according to a Poisson process with parameter $\lambda$ and that each facility
i (i = 1, 2, ..., m) has the same exponential service-time distribution with parameter
$\mu_i$ for its $s_i$ servers, where $s_i \mu_i > \lambda$. It then follows from the equivalence property that
(under steady-state conditions) each service facility has a Poisson input with parameter
$\lambda$. Therefore, the elementary M/M/s model of Sec. 16.6 (or its priority-discipline
counterpart in Sec. 16.8) can be used to analyze each service facility independently 
of the others! 


¹ For a survey, see Lemoine, Austin J.: "Networks of Queues - A Survey of Equilibrium Analysis,"
Management Science, 24(4):464-481, 1977.

² For a proof, see Burke, P. J.: "The Output of a Queueing System," Operations Research, 4(6):699-704,
1956.

Being able to use the M/M/s model to obtain all measures of performance for
each facility independently, rather than analyzing interactions between facilities, is a
tremendous simplification. For example, the probability of having n customers at a
given facility is given by the formula for $P_n$ given in Sec. 16.6 for the M/M/s model.
The joint probability of $n_1$ customers at facility 1, $n_2$ customers at facility 2, etc.,
then, is the product of the individual probabilities obtained in this simple way. In
particular, this joint probability can be expressed as

$$P\{(N_1, N_2, \ldots, N_m) = (n_1, n_2, \ldots, n_m)\} = P_{n_1} P_{n_2} \cdots P_{n_m}.$$


(This simple form for the solution is called the product form solution.) Similarly, 
the expected total waiting time and the expected number of customers in the entire 
system can be obtained by merely summing the corresponding quantities obtained at 
the respective facilities. 
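
The product form solution and the summing of per-facility measures can be illustrated with a small sketch. To keep it short, it assumes that every facility has a single server (so each behaves as an M/M/1 queue); the function names and the sample rates ($\lambda = 2$ with service rates 3 and 4) are hypothetical.

```python
def mm1_pn(lam, mu, n):
    """Steady-state probability of n customers in an M/M/1 queue (lam < mu)."""
    rho = lam / mu
    return (1.0 - rho) * rho ** n

def series_joint_probability(lam, mus, counts):
    """Product form solution for single-server infinite queues in series:
    P{(N_1, ..., N_m) = (n_1, ..., n_m)} is the product of the M/M/1 probabilities."""
    prob = 1.0
    for mu, n in zip(mus, counts):
        prob *= mm1_pn(lam, mu, n)
    return prob

def series_totals(lam, mus):
    """Expected total number in the system and expected total waiting time,
    obtained by summing the M/M/1 values L_i = lam/(mu_i - lam) and
    W_i = 1/(mu_i - lam) over the facilities."""
    total_l = sum(lam / (mu - lam) for mu in mus)
    total_w = sum(1.0 / (mu - lam) for mu in mus)
    return total_l, total_w

lam, mus = 2.0, [3.0, 4.0]
print(series_joint_probability(lam, mus, [0, 0]))   # both facilities empty
print(series_totals(lam, mus))                      # totals satisfy L = lam * W
```

For facilities with several servers, the M/M/1 probabilities would be replaced by the M/M/s formula for $P_n$ from Sec. 16.6; the product structure is unchanged.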

Unfortunately, the equivalence property and its implications do not hold for the 
case of finite queues discussed in Sec. 16.6. This case is actually quite important in 
practice, because there is often a definite limitation on the queue length in front of 
service facilities in networks. For example, only a small amount of buffer storage 
space is typically provided in front of each facility (station) in a production-line 
system. For such systems of finite queues in series, no simple product form solution 
is available. The facilities must be analyzed jointly instead, and only limited results 
have been obtained. 


Jackson Networks 


Systems of infinite queues in series are not the only queueing networks where the 
M/M/s model can be used to analyze each service facility independently of the others. 
Another prominent kind of network with this property (a product form solution) is the 
Jackson network, named after the individual who first characterized the network and 
showed that this property holds.¹

The characteristics of a Jackson network are the same as assumed above for the 
system of infinite queues in series, except now the customers visit the facilities in 
different orders (and may not visit them all). For each facility, its arriving customers 
come from both outside the system (according to a Poisson process) and from the
other facilities. These characteristics are summarized below.


A Jackson network is a system of m service facilities where facility i
(i = 1, 2, ..., m) has:

1. An infinite queue;

2. Customers arriving from outside the system according to a Poisson input process
with parameter $a_i$; and

3. $s_i$ servers with the same exponential service-time distribution with parameter $\mu_i$.

A customer leaving facility i is routed next to facility j (j = 1, 2, ..., m, but j ≠ i)
with probability $p_{ij}$ or departs the system with probability

$$q_i = 1 - \sum_{j=1,\, j \ne i}^{m} p_{ij}.$$


¹ See Jackson, James R.: "Jobshop-Like Queueing Systems," Management Science, 10(1):131-142, 1963.


Any such network has the following key property. 


Under steady-state conditions, each facility j (j = 1, 2, ..., m) in a Jackson
network behaves as if it were an independent M/M/s queueing system with arrival
rate

$$\lambda_j = a_j + \sum_{i=1,\, i \ne j}^{m} \lambda_i p_{ij},$$

where $s_j \mu_j > \lambda_j$.


This key property cannot be proven directly from the equivalence property this 
time (the reasoning would become circular), but its intuitive underpinning is still 
provided by the latter property. The intuitive viewpoint (not quite technically correct) 
is that, for each facility i, its input processes from the various sources (outside and
other facilities) are independent Poisson processes, so the aggregate input process is
Poisson with parameter $\lambda_i$ (Property 6 in Sec. 16.4). The equivalence property then
says that the aggregate output process for facility i must be Poisson with parameter
$\lambda_i$. Disaggregating this output process (Property 6 again), the process for customers
going from facility i to facility j must be Poisson with parameter $\lambda_i p_{ij}$. This process
becomes one of the Poisson input processes for facility j, thereby helping to maintain 
the series of Poisson processes in the overall system. 

The above equation for obtaining $\lambda_j$ is based on the fact that $\lambda_i$ is the departure
rate as well as the arrival rate for all customers using facility i. Because $p_{ij}$ is the
proportion of customers departing from facility i that go next to facility j, the rate at
which customers from facility i arrive at facility j is $\lambda_i p_{ij}$. Summing this product over
all i ≠ j, and then adding this sum to $a_j$, gives the total arrival rate to facility j from
all sources.

To calculate $\lambda_j$ from this equation requires knowing the $\lambda_i$ for i ≠ j, but these
$\lambda_i$ also are unknowns given by the corresponding equations. Therefore, the procedure
is to solve simultaneously for $\lambda_1, \lambda_2, \ldots, \lambda_m$ by obtaining the simultaneous solution
of the entire system of linear equations for $\lambda_j$ for j = 1, 2, ..., m.

To illustrate these calculations, consider a Jackson network with three service
facilities that have the parameters shown in Table 16.5. Plugging into the formula for
$\lambda_j$ for j = 1, 2, 3, the resulting system of equations is

$$\lambda_1 = 1 + 0.1\lambda_2 + 0.4\lambda_3$$
$$\lambda_2 = 4 + 0.6\lambda_1 + 0.4\lambda_3$$
$$\lambda_3 = 3 + 0.3\lambda_1 + 0.3\lambda_2.$$

(Reason through each equation to see why it gives the total arrival rate.) The
simultaneous solution for this system is

$$\lambda_1 = 5, \quad \lambda_2 = 10, \quad \lambda_3 = 7\tfrac{1}{2}.$$
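
One way to carry out this simultaneous solution is ordinary elimination on the linear system. The sketch below uses exact fractions and a small Gauss-Jordan routine; the function name `solve_traffic_equations` and the data layout (`a[j]` for outside arrival rates, `p[i][j]` for routing probabilities from facility i to facility j) are assumptions made for illustration, with the numbers read off the three equations above.

```python
from fractions import Fraction

def solve_traffic_equations(a, p):
    """Solve lambda_j = a[j] + sum over i != j of lambda_i * p[i][j] for all j.

    a : outside arrival rates, one per facility
    p : p[i][j] is the probability that a customer leaving facility i goes
        next to facility j (diagonal entries are ignored)
    Returns the arrival rates as exact Fractions.
    """
    m = len(a)
    # Build the augmented matrix for lambda_j - sum_{i != j} p[i][j] * lambda_i = a[j].
    rows = []
    for j in range(m):
        row = [Fraction(1) if i == j else -Fraction(p[i][j]) for i in range(m)]
        row.append(Fraction(a[j]))
        rows.append(row)
    # Gauss-Jordan elimination with a simple nonzero-pivot search.
    for col in range(m):
        pivot = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [x / pivot_value for x in rows[col]]
        for r in range(m):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[j][m] for j in range(m)]

# Data implied by the three equations above (strings keep the decimals exact).
a = [1, 4, 3]
p = [["0", "0.6", "0.3"],
     ["0.1", "0", "0.3"],
     ["0.4", "0.4", "0"]]
lams = solve_traffic_equations(a, p)
print(lams)
```

Exact fractions are used only to avoid rounding noise in such a small system; a floating-point linear solver would serve equally well for larger networks.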


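Table 16.5 Data for the Example of a Jackson Network

                                      p_ij
Facility j    s_j    μ_j    a_j    i = 1    i = 2    i = 3
j = 1          1     10      1      0       0.1      0.4
j = 2          2     10      4      0.6     0        0.4
j = 3          1     10      3      0.3     0.3      0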

Given this simultaneous solution, each of the three service facilities now can
be analyzed independently by using the formulas for the M/M/s model given in Sec.
16.6. For example, to obtain the distribution of the number of customers, $N_i = n_i$,
at facility i, note that

$$\rho_i = \frac{\lambda_i}{s_i \mu_i} = \begin{cases} \tfrac{1}{2}, & \text{for } i = 1 \\ \tfrac{1}{2}, & \text{for } i = 2 \\ \tfrac{3}{4}, & \text{for } i = 3. \end{cases}$$

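Plugging these values (and the parameters in Table 16.5) into the formula for $P_n$ gives

$$P_{n_1} = \tfrac{1}{2}\left(\tfrac{1}{2}\right)^{n_1} \quad \text{for facility 1,}$$

$$P_{n_2} = \begin{cases} \tfrac{1}{3}, & \text{for } n_2 = 0 \\ \tfrac{1}{3}, & \text{for } n_2 = 1 \\ \tfrac{1}{3}\left(\tfrac{1}{2}\right)^{n_2 - 1}, & \text{for } n_2 \ge 2 \end{cases} \quad \text{for facility 2,}$$

$$P_{n_3} = \tfrac{1}{4}\left(\tfrac{3}{4}\right)^{n_3} \quad \text{for facility 3.}$$

The joint probability of $(n_1, n_2, n_3)$ then is given simply by the product form solution,

$$P\{(N_1, N_2, N_3) = (n_1, n_2, n_3)\} = P_{n_1} P_{n_2} P_{n_3}.$$
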
In a similar manner, the expected number of customers $L_i$ at facility i can be
calculated from Sec. 16.6 as

$$L_1 = 1, \quad L_2 = \tfrac{4}{3}, \quad L_3 = 3.$$

The expected total number of customers in the entire system then is

$$L = L_1 + L_2 + L_3 = 5\tfrac{1}{3}.$$


Obtaining W, the expected total waiting time in the system (including service
times) for a customer, is a little trickier. You cannot simply add the expected waiting
times at the respective facilities, because a customer does not necessarily visit each
facility exactly once. However, Little's formula can still be used, where the system
arrival rate $\lambda$ is the sum of the arrival rates from outside to the facilities,
$\lambda = a_1 + a_2 + a_3 = 8$. Thus

$$W = \frac{L}{a_1 + a_2 + a_3} = \frac{2}{3}.$$
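
The per-facility calculations and the final use of Little's formula can be checked with a short sketch. The function name `mms_l` is hypothetical, and the server counts (1, 2, 1) and common service rate of 10 are assumed values chosen to be consistent with the utilizations $\rho_i$ of 1/2, 1/2, and 3/4 given above; the M/M/s expressions are the standard steady-state formulas.

```python
from fractions import Fraction
from math import factorial

def mms_l(lam, mu, s):
    """Expected number of customers in an M/M/s facility (requires lam < s*mu)."""
    r = Fraction(lam) / Fraction(mu)
    rho = r / s
    p0 = 1 / (sum(r ** n / factorial(n) for n in range(s))
              + r ** s / (factorial(s) * (1 - rho)))
    lq = p0 * r ** s * rho / (factorial(s) * (1 - rho) ** 2)
    return lq + r        # L = L_q + lam/mu

# Arrival rates from the traffic equations; servers and service rates are assumed.
lams = [Fraction(5), Fraction(10), Fraction(15, 2)]
servers = [1, 2, 1]
mus = [10, 10, 10]
outside_rates = [1, 4, 3]

Ls = [mms_l(lam, mu, s) for lam, mu, s in zip(lams, mus, servers)]
L_total = sum(Ls)
W_total = L_total / sum(outside_rates)   # Little's formula with the outside arrival rate
print(Ls, L_total, W_total)
```

Dividing by the total outside arrival rate, rather than by the sum of the $\lambda_i$, is the point made above: a customer may visit some facilities several times or not at all, so only the rate at which customers enter the network is the right divisor.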

In conclusion, we should point out that there do exist other (more complicated)
kinds of queueing networks as well, where the individual service facilities can be
analyzed independently from the others. In fact, finding queueing networks with a
product form solution has been the Holy Grail for research on queueing networks.
One source of additional information is Selected Reference 5.


16.10 Conclusions 


Queueing systems are prevalent throughout society. The adequacy of these systems 
can have an important effect on the quality of life and productivity. 

Queueing theory studies queueing systems by formulating mathematical models
of their operation and then using these models to derive measures of performance.
This analysis provides vital information for effectively designing queueing systems
that achieve an appropriate balance between the cost of providing a service and the
cost associated with waiting for that service.

This chapter presented the most basic models of queueing theory for which 
particularly useful results are available. However, many other interesting models could 
be considered if space permitted. In fact, several thousand research papers formulating 
and/or analyzing queueing models have already appeared in the technical literature, 
and many more are being published each year! 

The exponential distribution plays a fundamental role in queueing theory for 
representing the distribution of interarrival and service times, because this assumption 
enables us to represent the queueing system as a continuous time Markov chain. For 
the same reason, phase-type distributions such as the Erlang distribution, where the
total time is broken down into individual phases having an exponential distribution,
are very useful. Useful analytical results have been obtained for only a relatively few
queueing models making other assumptions.

Priority-discipline queueing models are useful for the common situation where 
some categories of customers are given priority over others for receiving service. 

Another common situation is where customers must receive service at several 
different service facilities. Models for queueing networks are gaining widespread use 
for such situations. This is an area of especially active ongoing research. 

When no tractable model that provides a reasonable representation of the 
queueing system under study is available, a common approach is to obtain relevant 
performance data by developing a computer program for simulating the operation of 
the system. This technique is discussed in Chap. 23. 

Chapter 17 describes how queueing theory can be used to help design effective 
queueing systems. 


PROBLEMS¹


1.* Consider a typical barber shop. Demonstrate that it is a queueing system by
describing its components.


2. Identify the customers and the servers in the queueing system in each of the following 
situations: 

(a) The checkout stand in a grocery store. 

(b) A fire station. 


¹ See also the end of Chap. 17 for problems involving the application of queueing theory.

(c) The toll booth for a bridge.

(d) A bicycle repair shop. 

(e) A shipping dock. 

(f) A group of semiautomatic machines assigned to one operator. 
(g) The materials-handling equipment in a factory area. 

(h) A plumbing shop. 

(i) A job shop producing custom orders. 

(j) A secretarial typing pool. 


3. Suppose that a queueing system has two servers, an exponential interarrival time 
distribution with a mean of 2 hours, and an exponential service-time distribution with a mean 
of 2 hours. Furthermore, a customer has just arrived at 12:00 noon. 

(a) What is the probability that the next arrival will come before 1:00 P.M.? Between
1:00 and 2:00 P.M.? After 2:00 P.M.?

(b) Suppose that no additional customers arrive before 1:00 P.M. Now what is the
probability that the next arrival will come between 1:00 and 2:00 P.M.?

(c) What is the probability that the number of arrivals between 1:00 and 2:00 P.M. will
be zero? One? Two or more?

(d) Suppose that both servers are serving customers at 1:00 P.M. What is the probability
that neither customer will have service completed before 2:00 P.M.? Before 1:10
P.M.? Before 1:01 P.M.?


4.* The jobs to be performed on a particular machine arrive according to a Poisson
input process with a mean rate of two per hour. Suppose that the machine breaks down and
will require 1 hour to be repaired. What is the probability that the number of new jobs that 
will arrive during this time is (a) zero, (b) two, (c) five or more? 


5. A service station has one gasoline pump. Cars wanting gasoline arrive according to
a Poisson process at a mean rate of 15 per hour. However, if the pump already is being used,
these potential customers may balk (drive on to another service station). In particular, if there
are n cars already at the service station, the probability that an arriving potential customer will
balk is n/3 for n = 1, 2, 3. The time required to service a car has an exponential distribution
with a mean of 4 minutes. 

(a) Construct the rate diagram for this queueing system. 

(b) Develop the balance equations. 

(c) Solve these equations to find the steady-state probability distribution of the number 
of cars at the station. Verify that this solution is the same as that given by the general 
solution for the birth-and-death process. 

(d) Find the expected waiting time (including service) for those cars that stay. 


6. Consider a variation of the M/M/1 model where customers renege (leave the 
queueing system without being served) if their waiting time in the queue grows too large. In 
particular, assume that the time each customer is willing to wait in the queue before reneging 
has an exponential distribution with a mean of $1/\theta$.

(a) Construct the rate diagram for this queueing system. 

(b) Develop the balance equations. 


7.* A certain small grocery store has a single checkout stand with a full-time cashier. 
Customers arrive at the stand "randomly" (i.e., a Poisson input process) at a mean rate of 30
per hour. When there is only one customer at the stand, he is processed by the cashier alone, 
with an expected service time of 1.5 minutes. However, the stock boy has been given standard 
instructions that whenever there is more than one customer at the stand, he is to help the cashier 
by bagging the groceries. This help reduces the expected time required to process a customer 
to 1 minute. In both cases, the service-time distribution is exponential. 

(a) Construct the rate diagram for this queueing system. 


(b) What is the steady-state probability distribution of the number of customers at the 
checkout stand? 
(c) Derive L for this system. Use this information to determine $L_q$, W, and $W_q$.


8. Consider a self-service model in which the customer is also the server. Note that this 
corresponds to having an infinite number of servers available. Customers arrive according to a 
Poisson process with parameter $\lambda$, and service times have an exponential distribution with
parameter $\mu$.

(a) Find $L_q$ and $W_q$.

(b) Construct the rate diagram for this queueing system.

(c) Use the balance equations to find the expression for $P_n$ in terms of $P_0$.

(d) Find $P_0$.

(e) Find L and W. 


9. For each of the following models, write the balance equations and show that they 
are satisfied by the solution given in Sec. 16.6. 

(a) The M/M/1 model. 

(b) The finite queue variation of the M/M/1 model, with K = 2. 

(c) The finite calling population variation of the M/M/1 model, with N = 2. 


10. Consider the M/M/s model. 

(a) Suppose there is one server and the expected service time is exactly 1 minute. 
Compare L for the cases where the mean arrival rate is 0.5, 0.9, and 0.99 customers 
per minute, respectively. Do the same for L_q, W, W_q, and P{W > 5}. 

(b) Now suppose there are two servers and the expected service time is exactly 2 minutes. 
Make the same comparisons as for part (a). 


11. A bank employs four tellers to serve its customers. Customers arrive according to 
a Poisson process at a mean rate of three per minute. If a customer finds all tellers busy, he 
joins a queue that is serviced by all tellers; that is, there are no lines in front of each teller, but 
rather one line waiting for the first available teller. The transaction time between the teller and 
customer has an exponential distribution with a mean of 1 minute. 

(a) Construct the rate diagram for this queueing system. 

(b) Find the steady-state probability distribution of the number of customers in the bank. 

(c) Find L_q, W_q, W, and L. 


12.* Jobs arrive at a particular work center according to a Poisson input process at a 
mean rate of two per day, and the operation time has an exponential distribution with a mean 
of 1/4 day. Enough in-process storage space is provided at the work center to accommodate three 
jobs in addition to the one being processed, whereas excess jobs are stored temporarily in a 
less convenient location. For what proportion of the time will this storage space at the work 
center be adequate to accommodate all waiting jobs? 


13. It is necessary to determine how much in-process storage space to allocate to a 
particular work center in a new factory. Jobs arrive at this work center according to a Poisson 
process with a mean rate of three per hour, and the time required to perform the necessary 
work has an exponential distribution with a mean of 0.25 hour. Whenever the waiting jobs 
require more in-process storage space than has been allocated, the excess jobs are stored 
temporarily in a less convenient location. If each job requires 1 square foot of floor space while 
it is in in-process storage at the work center, how much space must be provided to accommodate 
all waiting jobs (a) 50 percent of the time? (b) 90 percent? (c) 99 percent? Hint: The sum of 
a geometric series is 

$$\sum_{n=0}^{N} x^n = \frac{1 - x^{N+1}}{1 - x}.$$


14. Section 16.6 gives the following equations for the M/M/1 model: 

(1) $$P\{W > t\} = \sum_{n=0}^{\infty} P_n \, P\{S_{n+1} > t\}.$$

(2) $$P\{W > t\} = e^{-\mu(1-\rho)t}.$$

Show that Eq. (1) reduces algebraically to Eq. (2). 
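
The identity in Prob. 14 can be checked numerically before attempting the algebra. The sketch below is only a sanity check, not the requested derivation. It assumes the standard M/M/1 results P_n = (1 − ρ)ρ^n with ρ = λ/μ, and that S_{n+1} (the sum of n + 1 independent exponential service times with rate μ) has an Erlang tail. The function names, the truncation level n_max, and the parameter values are illustrative choices that do not come from the text.

```python
import math

def erlang_tail(k, mu, t):
    """P{S_k > t}, where S_k is the sum of k independent exponential(mu) times."""
    # Accumulate sum_{j=0}^{k-1} (mu*t)^j / j! term by term to avoid factorials.
    term, total = 1.0, 0.0
    for j in range(k):
        total += term
        term *= mu * t / (j + 1)
    return math.exp(-mu * t) * total

def waiting_tail_series(lam, mu, t, n_max=500):
    """Truncated version of Eq. (1): sum over n of P_n * P{S_{n+1} > t}."""
    rho = lam / mu
    return sum((1 - rho) * rho**n * erlang_tail(n + 1, mu, t)
               for n in range(n_max + 1))

def waiting_tail_closed(lam, mu, t):
    """Eq. (2): exp(-mu * (1 - rho) * t)."""
    rho = lam / mu
    return math.exp(-mu * (1 - rho) * t)

if __name__ == "__main__":
    # Illustrative values only (assumed, not from the text).
    lam, mu, t = 0.9, 1.0, 5.0
    print(waiting_tail_series(lam, mu, t), waiting_tail_closed(lam, mu, t))
```

The two printed values should agree to many decimal places whenever λ < μ, because the neglected tail of the series is of order ρ^(n_max + 1).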


15. Derive W_q directly for the following cases by developing and reducing an expression 
analogous to Eq. (1), Prob. 14. (Hint: Use the conditional expected waiting time in the queue 
given that a random arrival finds n customers already in the system.) 

(a) The M/M/1 model. 

(b) The M/M/s model. 


16. For the finite queue variation of the M/M/1 model, develop an expression analogous 
to Eq. (1) in Prob. 14 for the following probabilities: 

(a) P{W > t}. 

(b) P{W_q > t}. 


[Hint: Arrivals can occur only when the system is not full, so the probability that a random 
arrival finds n customers already there is P_n/(1 − P_K).] 


17. An airline ticket office has two ticket agents answering incoming phone calls for 
flight reservations. In addition, one caller can be put on hold until one of the agents is available 
to take the call. If all three phone lines (both agent lines and the hold line) are busy, a potential 
customer gets a busy signal, and it is assumed that the call goes to another ticket office and 
that the business is lost. The calls and attempted calls occur randomly (i.e., according to a 
Poisson process) at a mean rate of 15 per hour. The length of a telephone conversation has an 
exponential distribution with a mean of 4 minutes. 

(a) Construct the rate diagram for this queueing system. 

(b) Find the steady-state probability that 

(i) A caller will get to talk to an agent immediately, 
(ii) The caller will be put on hold, 
(iii) The caller will get a busy signal. 


18. The Copy Shop is open 5 days per week for copying materials that are brought to 
the shop. It has three identical copying machines, but only two operators are kept on duty to 
run the machines, so the third machine is a spare that is used only when one of the other 
machines breaks down. When a machine is being used, the time until it breaks down has an 
exponential distribution with a mean of 2 weeks. If a machine breaks down while the other 
two are operational, a repairman is called in to repair it, in which case the total time from the 
breakdown until the repair is completed has an exponential distribution with a mean of 0.2 
week. However, if a second machine breaks down before the first one has been repaired, the 
third machine is shut off while the two operators work together to repair this second machine 
quickly, in which case its repair time has an exponential distribution with a mean of only 1/15 
week. If the repairman finishes repairing the first machine before the two operators complete 
the repair of the second, the operators go back to running the two operational machines while 
the repairman finishes the second repair, in which case the remaining repair time has an exponential 
distribution with a mean of 0.2 week. 

(a) Letting the state of the system be the number of machines not working, construct 
the rate diagram for this queueing system. 

(b) Find the steady-state distribution of the number of machines not working. 

(c) What is the expected number of operators available for copying? 


19. Consider a single-server queueing system with a finite queue that can hold a maximum 
of two customers excluding any being served. The server can provide batch service to 
two customers simultaneously, where the service time has an exponential distribution with a 
mean of one unit of time regardless of the number being served. Whenever the queue is not 
full, customers arrive individually according to a Poisson process at a mean rate of one per 
unit of time. 

(a) Assume that the server must serve two customers simultaneously. Thus, if the server 
is idle when only one customer is in the system, the server must wait for another 
arrival before beginning service. Formulate the queueing model as a continuous time 
Markov chain by defining the states and then constructing the rate diagram. Give 
the balance equations, but do not solve further. 

(b) Now assume that the batch size for a service is two only if two customers are in 
the queue when the server finishes the preceding service. Thus, if the server is idle 
when only one customer is in the system, the server must serve this single customer, 
and any subsequent arrivals must wait in the queue until service is completed for 
this customer. Formulate the resulting queueing model as a continuous time Markov 
chain by defining the states and then constructing the rate diagram. Give the balance 
equations, but do not solve further. 


20. Consider a queueing system that has two classes of customers, two clerks providing 
service, and no queue. Potential customers from each class arrive according to a Poisson 
process, with a mean arrival rate of 10 customers per hour for class 1 and 5 customers per hour 
for class 2, but these arrivals are lost to the system if they cannot immediately enter service. 

Each customer of class 1 that enters the system will receive service from either one of 
the clerks that is free, where the service times have an exponential distribution with a mean of 
5 minutes. 

Each customer of class 2 that enters the system requires the simultaneous use of both 
clerks (the two clerks would work together as a single server), where the service times have 
an exponential distribution with a mean of 5 minutes. Thus, an arriving customer of this kind 
would be lost to the system unless both clerks are free to begin service immediately. 

(a) Formulate the queueing model as a continuous time Markov chain by defining the 
states and constructing the rate diagram. 

(b) Construct the balance equations and use them to solve for the steady-state joint 
distribution of the number of customers of each class in the system. 

(c) For each of the two classes of customers, what is the expected fraction of arrivals 
that are unable to enter the system? 


21.* Plans are being made to open a small car-wash operation, and the owner must 
decide how much space to provide for waiting cars. It is estimated that customers would arrive 
randomly (i.e., a Poisson input process) with a mean rate of one every 4 minutes, unless the 
waiting area is full, in which case the arriving customers would take their cars elsewhere. The 
time that can be attributed to washing one car has an exponential distribution with a mean of 
3 minutes. Compare the expected fraction of potential customers that would be lost because of 
inadequate waiting space if (a) zero spaces (not including the car being washed) were to be 
provided. (b) Two spaces. (c) Four spaces. 


22. Consider the finite queue variation of the M/M/s model. Derive the expression for 
L_q given in Sec. 16.6 for this model. 


23.* Suppose that one repairman has been assigned the responsibility of maintaining 
three machines. For each machine, the probability distribution of the running time before a 
breakdown is exponential, with a mean of 9 hours. The repair time also has an exponential 
distribution, with a mean of 2 hours. 
(a) Calculate the steady-state probability distribution and the expected number of machines 
that are not running. 
(b) As a crude approximation, assume that the calling population is infinite, so that the 
input process is Poisson with a mean arrival rate of three every 9 hours. Compare 
the result from part (a) with that obtained by making this approximation using (1) 
the corresponding infinite queue model and (2) the corresponding finite queue model. 
(c) Now suppose that a second repairman is available whenever more than one of these 
three machines requires repair. Calculate the information specified in part (a). 


24. Plans are currently being developed for a new factory. One department has been 
allocated a large number of automatic machines of a certain type, and we need to determine 
how many machines should be assigned to each operator for servicing (loading, unloading, 
adjusting, setup, and so on). For the purpose of this analysis, the following information has 
been provided. 

The running time (time between completing service and the machine requiring service 
again) of each machine has an exponential distribution, with a mean of 150 minutes. The 
service time has an exponential distribution, with a mean of 15 minutes. Each operator attends 
to her own machine; she does not give help to or receive help from other operators. For the 
department to achieve the required production rate, the machines must be running at least 89 
percent of the time on the average. 

(a) What is the maximum number of machines that can be assigned to an operator while 
still achieving the required production rate? 

(b) Given that the maximum number found in part (a) is assigned to each operator, what 
is the expected fraction of time that the operators will be busy servicing machines? 


25. Consider a single-server queueing system. It has been observed that this server seems 
to speed up as the number of customers in the system increases, and that the pattern of acceleration 
seems to fit the state-dependent model presented at the end of Sec. 16.6. Furthermore, 
it is estimated that the expected service time is 8 minutes when there is only one customer in 
the system. Determine the pressure coefficient c for this model for the following cases: 

(a) The expected service time is estimated to be 4 minutes when there are four customers 
in the system. 

(b) The expected service time is estimated to be 5 minutes when there are four customers 
in the system. 


26. For the state-dependent model presented at the end of Sec. 16.6, show the effect 
of the pressure coefficient c by using Fig. 16.10 to construct a table giving the ratio (expressed 
as a decimal number) of L for this model to L for the corresponding M/M/s model (that is, 
with c = 0). Tabulate these ratios for λ_0/sμ_1 = 0.5, 0.9, 0.99 when c = 0.2, 0.4, 0.6, and 
s = 1, 2. 


27.* Consider the M/G/1 model. 

(a) Compare the expected waiting time in the queue if the service-time distribution is 
(i) exponential, (ii) constant, (iii) Erlang with the amount of variation (i.e., the 
standard deviation) halfway between the constant and exponential cases. 

(b) What is the effect on the expected waiting time in the queue and on the expected 
queue length if both λ and μ are doubled and the scale of the service-time distribution 
is changed accordingly? 


28. Consider a queueing system with a Poisson input, where the server must perform 
two distinguishable tasks in sequence for each customer, so the total service time is the sum 
of the two task times (which are statistically independent). 

(a) Suppose that the first task time has an exponential distribution with a mean of 3 
minutes and the second task time has an Erlang distribution with a mean of 9 minutes 
and with the shape parameter k = 3. Which queueing theory model should be used 
to represent this system? 

(b) Suppose that part (a) is modified so that the first task time also has an Erlang 
distribution with the shape parameter k = 3 (but with the mean still equal to 3 
minutes). Which queueing theory model should be used to represent this system? 


29.* An airline maintenance base has facilities for overhauling only one airplane engine 
at a time. Therefore, to return the airplanes to use as soon as possible, the policy has been to 
stagger the overhauling of the four engines of each airplane. In other words, only one engine 
is overhauled each time an airplane comes into the shop. Under this policy, airplanes have 
arrived according to a Poisson process at a mean rate of one per day. The time required for an 
engine overhaul (once work has begun) has an exponential distribution with a mean of 1/2 day. 

A proposal has been made to change the policy so that all four engines are overhauled 
consecutively each time an airplane comes into the shop. Although this would quadruple the 
expected service time, each plane would need to come into the shop only one-fourth as often. 

Use queueing theory to compare the two alternatives on a meaningful basis. 


30. Consider a shoe repair shop with a single repairman. Pairs of shoes are brought in 
to be repaired (on a first-come-first-served basis) according to a Poisson process at a mean rate 
of one pair per hour. The time required to repair each individual shoe has an exponential 
distribution with a mean of 15 minutes. 

(a) Consider the formulation of this queueing system where the individual shoes (not 
pairs of shoes) are considered to be the customers. For this formulation, construct 
the rate diagram and develop the balance equations, but do not solve further. 

(b) Now consider the formulation of this queueing system where the pairs of shoes are 
considered to be the customers. Identify the specific queueing model that fits this 
formulation, and then use the available results for this model to calculate the expected 
waiting time W. (Interpret W to be the expected waiting time until both shoes in a 
pair have been repaired.) 


31. Airplanes arrive at a certain maintenance base for overhaul according to a Poisson 
process with parameter λ = 3 (per week). Service times have an Erlang distribution with 
parameters μ = 4 (per week) and k = 2. Only one airplane can be overhauled at a time. 

(a) How should the states of the system be defined in order to formulate the queueing 
model as a continuous time Markov chain? 

(b) Construct the corresponding rate diagram. 

(c) What are L_q, L, W_q, and W for this queueing system? [Hint: Do not use parts (a) 
and (b).] 


32. Consider a single-server queueing system with any service-time distribution and any 
distribution of interarrival times (the GI/G/1 model). Use only basic definitions and the relationships 
given in Sec. 16.2 to verify the following general relationships: 

(a) L = L_q + (1 − P_0). 

(b) L = L_q + ρ. 

(c) P_0 = 1 − ρ. 


33. Show that 


$$L = \sum_{n=0}^{s-1} n P_n + L_q + s\left(1 - \sum_{n=0}^{s-1} P_n\right)$$

by using the statistical definitions of L and L_q in terms of the P_n. 
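
Before proving the relationship in Prob. 33, it may help to confirm it numerically on an arbitrary distribution. The sketch below assumes the usual definitions L = Σ n·P_n and L_q = Σ_{n ≥ s} (n − s)·P_n for a system with s servers. The function name and the sample probabilities are made up for illustration and do not correspond to any model in the text.

```python
def both_sides(p, s):
    """Return (L, right-hand side) for the identity in Prob. 33.

    p[n] is P_n for n = 0, 1, ..., len(p) - 1 and should sum to 1;
    s is the number of servers (this sketch requires s <= len(p)).
    """
    # L: expected number of customers in the system.
    L = sum(n * pn for n, pn in enumerate(p))
    # L_q: expected number waiting in the queue (excludes those in service).
    L_q = sum((n - s) * pn for n, pn in enumerate(p) if n >= s)
    # Right-hand side: partial sum of n*P_n, plus L_q, plus s * P{n >= s}.
    rhs = sum(n * p[n] for n in range(s)) + L_q + s * (1 - sum(p[:s]))
    return L, rhs

if __name__ == "__main__":
    # Hypothetical distribution over n = 0..3, chosen only for illustration.
    print(both_sides([0.4, 0.3, 0.2, 0.1], s=2))
```

The two returned values agree for any valid distribution, which suggests how the proof goes: split the sum defining L at n = s and rewrite n as (n − s) + s in the upper part.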


34. A company currently has two tool cribs, each with a single clerk, in its manufacturing 
area. One tool crib handles only the tools for the heavy machinery; the second one 
handles all other tools. However, for each crib the mechanics arrive to obtain tools at a mean 
rate of 24 per hour, and the expected service time is 2 minutes. 

Because of complaints that the mechanics coming to the tool cribs have to wait too long, 
it has been proposed that the two tool cribs be combined so that either clerk can handle either 
kind of tool as the demand arises. It is believed that the mean arrival rate to the combined two-clerk 
tool crib would double to 48 per hour, and the expected service time would continue to 
be 2 minutes. However, information is not available on the form of the probability distributions 
for interarrival and service times, so it is not clear which queueing model would be most 
appropriate. 

Compare the status quo and the proposal with respect to the total expected number of 
mechanics at the tool crib(s) and the expected waiting time (including service) for each mechanic. 
Do this by tabulating these data for the four queueing models considered in Figs. 16.7, 
16.11, 16.13, and 16.14 (use k = 2 when an Erlang distribution is appropriate). 


35. Consider the model with nonpreemptive priorities presented in Sec. 16.8. Suppose 
there are just two priority classes, with λ_1 = 4 and λ_2 = 4. In designing this queueing system, 
you are offered the choice between the following two alternatives: (1) one fast server 
(μ = 10) and (2) two slow servers (μ = 5). 

Compare these alternatives with the usual four mean measures of performance (W, L, 
W_q, L_q) for the individual priority classes (W_1, W_2, L_1, L_2, and so forth). Which alternative is 
preferred if your primary concern is expected waiting time in the system for priority class 1 
(W_1)? Which is preferred if your primary concern is expected waiting time in the queue for 
priority class 1? 


36.* A particular work center in a job shop can be represented as a single-server 
queueing system, where jobs arrive according to a Poisson process, with a mean rate of eight 
per day. Although the arriving jobs are of three distinct types, the time required to perform 
any of these jobs has the same exponential distribution, with a mean of 0.1 working day. The 
practice has been to work on arriving jobs on a first-come-first-served basis. However, it is 
important that jobs of type 1 do not have to wait very long, whereas the wait is only moderately 
important for jobs of type 2 and relatively unimportant for jobs of type 3. These three types 
arrive with a mean rate of two, four, and two per day, respectively. Because all three types 
have experienced rather long delays on the average, it has been proposed that the jobs be 
selected according to an appropriate priority discipline instead. 

Compare the expected waiting time (including service) for each of the three types of jobs 
if the queue discipline is (a) first-come-first-served, (b) nonpreemptive priority, or (c) preemptive 
priority. 


37. Reconsider the County Hospital emergency room problem as analyzed in Sec. 16.8. 
Suppose that the definitions of the three categories of patients are tightened somewhat in order 
to move marginal cases into a lower category. Consequently, only 5 percent of the patients will 
qualify as critical cases, 20 percent as serious cases, and 75 percent as stable cases. Develop 
a table showing the data presented in Table 16.4 for this revised problem. 


38. One inspector has been assigned the full-time task of inspecting the output from a 
group of 10 identical machines. Jobs to be done by any one of the machines arrive according 
to a Poisson process at a mean rate of 70 per hour. The time required by a machine to perform 
each job has an exponential distribution with a mean of 6 minutes. Thus, whenever all 10 
machines are busy, the jobs are being completed ready for inspection at a mean rate of 100 per 
hour. Unfortunately, the inspector is able to inspect them at a mean rate of only 80 per hour. 
(In particular, his inspection time has an Erlang distribution with a mean of 0.75 minute and 
a shape parameter k = 25.) This inspection rate has resulted in a substantial average amount 
of in-process inventory at the inspection station (i.e., the expected number of jobs waiting to 
be inspected is fairly large), in addition to that already found at the group of machines. Management 
feels that there is too much capital tied up in in-process inventory, so it has instructed 
the production manager to cut down on such inventory. Therefore, the production manager has 
made two alternative proposals to reduce the average level of in-process inventory. Proposal 1 
is to use slightly less power for the machines (which would increase their expected time to 
perform a job to 7 minutes), so that the inspector can keep up with their output better. Proposal 
2 is to substitute a certain younger inspector for this task. He is somewhat faster (albeit more 
variable in his inspection times because of less experience), so he should keep up better. (His 
inspection time would have an Erlang distribution with a mean of 0.72 minute and a shape 
parameter k = 2.) 


The production manager has asked you to "use the latest OR techniques to see how 
much each proposal would cut down on in-process inventory." 

(a) What would be the effect of proposal 1? Why? How would you explain this outcome 
to the production manager? 

(b) Determine the effect of proposal 2. How would you explain this outcome to the 
production manager? 

(c) What suggestions would you make for reducing the average level of in-process 
inventory at the inspection station? At the group of machines? 


39. Consider the queueing-theory model with a preemptive priority queue discipline 
presented in Sec. 16.8. Suppose that s = 1, N = 2, and (λ_1 + λ_2) < μ, and let P_ij be the 
steady-state probability that there are i members of the higher priority class and j members of 
the lower priority class in the queueing system (i = 0, 1, 2, . . . ; j = 0, 1, 2, . . .). Use a 
method analogous to that presented in Sec. 16.5 to derive a system of linear equations whose 
simultaneous solution is the P_ij. Do not actually obtain this solution. 


40. Consider a queueing system with two servers, where the customers arrive from two 
different sources. From Source 1, the customers always arrive two at a time, where the time 
between consecutive arrivals of pairs of customers has an exponential distribution with a mean 
of 20 minutes. Source 2 is itself a two-server queueing system, which has a Poisson input 
process with a mean rate of 7 customers per hour, and the service time from each of these two 
servers has an exponential distribution with a mean of 15 minutes. When a customer completes 
service at Source 2, it immediately enters the queueing system under consideration for another 
type of service. In the latter queueing system, the queue discipline is preemptive priority where 
customers from Source 1 always have preemptive priority over customers from Source 2. 
However, service times are independent and identically distributed for both types of customers 
according to an exponential distribution with a mean of 6 minutes. 

(a) First focus on the problem of deriving the steady-state distribution of only the number 
of Source 1 customers in the queueing system under consideration. Using a continuous 
time Markov chain formulation, define the states and construct the rate diagram 
for most efficiently deriving this distribution (but don’t actually derive it). 

(b) Now focus on the problem of deriving the steady-state distribution of the total 
number of customers of both types in the queueing system under consideration. 
Using a continuous time Markov chain formulation, define the states and construct 
the rate diagram for most efficiently deriving this distribution (but don’t actually 
derive it). 

(c) Now focus on the problem of deriving the steady-state joint distribution of the 
number of customers of each type in the queueing system under consideration. Using 
a continuous time Markov chain formulation, define the states and construct the rate 
diagram for deriving this distribution (but don’t actually derive it). 


41.* Consider a single-server queueing system with a Poisson input, Erlang service 
times, and a finite queue. In particular, suppose that k = 2, the mean arrival rate is two 
customers per hour, the expected service time is 0.25 hour, and the maximum permissible 
number of customers in the system is two. Derive the steady-state probability distribution of 
the number of customers in the system, and then calculate the expected number. Compare this 
result with the corresponding results when the service-time distribution is exponential. 


42. Consider the E_2/M/1 model with λ = 4 and μ = 5. 

(a) How should the states of the system be defined in order to formulate this model as 
a continuous time Markov chain? 

(b) Construct the corresponding rate diagram. 


43. A shop contains three identical machines that are subject to a failure of a certain 
kind. Therefore, a maintenance system is provided to perform the maintenance operation (recharging) 
required by a failed machine. The time required by each operation has an exponential 
distribution with a mean of 30 minutes. However, with probability 1/3, the operation must be 
performed a second time (with the same distribution of time) in order to bring the failed machine 
back to a satisfactory operational state. The maintenance system works on only one failed 
machine at a time, performing all the operations (one or two) required by that machine, on a 
first-come-first-served basis. After a machine is repaired, the time until its next failure has an 
exponential distribution with a mean of 3 hours. 

(a) How should the states of the system be defined in order to formulate this queueing 
system as a continuous time Markov chain? (Hint: Given that a first operation is 
being performed on a failed machine, completing this operation successfully and 
completing it unsuccessfully are two separate events of interest. Then use Property 
6 regarding disaggregation for the exponential distribution.) 

(b) Construct the corresponding rate diagram. 

(c) Develop the balance equations. 


44. A company has one repairman to keep a large group of machines in running order. 
Treating this group as an infinite calling population, individual breakdowns occur according to 
a Poisson process at a mean rate of one per hour. For each breakdown, the probability is 0.9 
that only a minor repair is needed, in which case the repair time has an exponential distribution 
with a mean of 1/2 hour. Otherwise, a major repair is needed, in which case the repair time has 
an exponential distribution with a mean of 5 hours. Because both of these conditional distributions 
are exponential, the unconditional (combined) distribution of repair times is hyperexponential. 

(a) Compute the mean and standard deviation of this hyperexponential distribution. 
[Hint: Use the general relationships from probability theory that, for any random 
variable X and any pair of mutually exclusive events, E_1 and E_2, E(X) = 
E(X|E_1)P(E_1) + E(X|E_2)P(E_2) and var(X) = E(X²) − E(X)².] Compare this 
standard deviation with that for an exponential distribution having this mean. 

(b) What are P_0, L_q, L, W_q, and W for this queueing system? 

(c) What is the conditional value of W, given that the machine involved requires a major 
repair? A minor repair? What is the division of L between machines requiring the 
two types of repairs? (Hint: Little’s formula still applies for the individual categories 
of machines.) 

(d) How should the states of the system be defined in order to formulate this queueing 
system as a continuous time Markov chain? (Hint: Consider what additional information 
must be given, besides the number of machines down, for the conditional 
distribution of the time remaining until the next event of each kind to be exponential.) 

(e) Construct the corresponding rate diagram. 


45. Consider a system of two infinite queues in series, where each of the two service 
facilities has a single server. All service times are independent and have an exponential distribution, 
with a mean of 3 minutes at facility 1 and 4 minutes at facility 2. Facility 1 has a 
Poisson input process with a mean rate of 10 per hour. 

(a) Find the steady-state distribution of the number of customers at facility 1, and then 
at facility 2. Then show the product form solution for the joint distribution of the 
number at the respective facilities. 

(b) What is the probability that both servers are idle? 

(c) Find the expected total number of customers in the system and the expected total 
waiting time (including service times) for a customer. 


46. Under the assumptions specified in Sec. 16.9 for a system of infinite queues in 
series, this kind of queueing network actually is a special case of a Jackson network. Demonstrate 
that this is true by describing this system as a Jackson network, including specifying 
the values of the a_j and the p_ij, given λ for this system. 


47. Consider a Jackson network with three service facilities having the parameter values 
shown below. 

[Table of parameter values for each facility j; not recoverable from this copy.] 

(a) Find the total arrival rate at each of the facilities. 

(b) Find the steady-state distribution of the number of customers at facility 1. At facility 
2. At facility 3. Then show the product form solution for the joint distribution of 
the number at the respective facilities. 

(c) What is the probability that all the facilities have empty queues (no customers waiting 
to begin service)? 

(d) Find the expected total number of customers in the system. 

(e) Find the expected total waiting time (including service times) for a customer. 


17 


The Application of 
Queueing Theory 


Queueing theory has enjoyed a prominent place among the modern analytical techniques 
of operations research. However, the emphasis thus far has been on developing 
a descriptive mathematical theory. Thus queueing theory is not directly concerned 
with achieving the goal of operations research: optimal decision making. Rather, it 
develops information on the behavior of queueing systems. This theory provides part 
of the information needed to conduct an operations research study attempting to find 
the best design for a queueing system. 

This chapter discusses the application of queueing theory in the broader context 
of an overall operations research study. It begins by introducing three examples that 
will be used for illustration throughout the chapter. Section 17.2 discusses the basic 
considerations for decision making in this context. The following two sections then 
develop decision models for the optimal design of queueing systems. The last model 
requires the incorporation of travel-time models, which are presented in Sec. 17.5. 


17.1 Examples 


Example 1—How Many Repairers? 


SIMULATION, INC., a small company that makes gidgets for analog computers, has 
10 gidget-making machines. However, because these machines break down and re- 
quire repair frequently, the company has only enough operators to operate eight ma- 
chines at a time, so two machines are available on a standby basis for use while other 
machines are down. Thus eight machines are always operating whenever no more 
than two machines are waiting to be repaired, but the number of operating machines 
is reduced by one for each additional machine waiting to be repaired. 

The time until any given operating machine breaks down has an exponential 
distribution, with a mean of 20 days. The time required to repair a machine also has 
an exponential distribution, with a mean of 2 days. Until now the company has had 
just one repairer to repair these machines, which has frequently resulted in reduced 
productivity because fewer than eight machines are operating. Therefore, the company 
is considering hiring a second repairer, so that two machines can be repaired simul- 
taneously. 

Thus the queueing system to be studied has the repairers as its servers and the 
machines requiring repair as its customers, where the problem is to choose between 
having one or two servers. (Notice the analogy between this problem and the County 
Hospital emergency room problem described in Sec. 16.1.) With one slight exception, 
this system fits the finite calling population variation of the M/M/s model presented 
in Sec. 16.6, where N = 10 machines, \(\lambda = 1/20\) customer per day (for each operating
machine), and \(\mu = 1/2\) customer per day. The exception is that the \(\lambda_0\) and \(\lambda_1\)
parameters of the birth-and-death process are changed from \(\lambda_0 = 10\lambda\) and \(\lambda_1 = 9\lambda\)
to \(\lambda_0 = 8\lambda\) and \(\lambda_1 = 8\lambda\). (All the other parameters are the same as those given in
Sec. 16.6.) Therefore, the \(C_n\) factors for calculating the \(P_n\) probabilities change
accordingly (see Sec. 16.5).
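
The modified birth-and-death rates can be checked numerically. The following is a minimal sketch, not the book's own procedure: the function names are hypothetical, and it assumes the failure rate in state n is min(8, 10 - n) times \(\lambda\) (which reproduces \(\lambda_0 = \lambda_1 = \lambda_2 = 8\lambda\) and the usual finite-population rates afterward) and that the repair rate is min(n, s) times \(\mu\).

```python
def steady_state(birth_rates, death_rates):
    """Return P_0..P_K for a finite birth-and-death process.

    birth_rates[n] is lambda_n and death_rates[n] is mu_(n+1), for n = 0..K-1.
    """
    factors = [1.0]  # C_0 = 1
    for lam_n, mu_next in zip(birth_rates, death_rates):
        # C_(n+1) = C_n * lambda_n / mu_(n+1)
        factors.append(factors[-1] * lam_n / mu_next)
    total = sum(factors)
    return [c / total for c in factors]


def repair_shop_probabilities(s, machines=10, operators=8, lam=1 / 20, mu=1 / 2):
    """P_n for Example 1, where n machines are down and s repairers are employed."""
    # With n machines down, min(operators, machines - n) machines are running and can fail.
    births = [min(operators, machines - n) * lam for n in range(machines)]
    # When leaving state n + 1, min(n + 1, s) repairers are busy.
    deaths = [min(n + 1, s) * mu for n in range(machines)]
    return steady_state(births, deaths)


if __name__ == "__main__":
    for s in (1, 2):
        probs = repair_shop_probabilities(s)
        print(f"s = {s}:", [round(p, 3) for p in probs])
```

Under these assumptions the one-repairer case gives a \(P_0\) of about 0.271, which matches the first entry of Table 17.1.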

Each repairer costs the company approximately $70/day. However, the esti- 
mated lost profit from having fewer than eight machines operating to produce gidgets 
is $100/day for each machine down. (The company can sell the full output from eight 
operating machines, but not much more.) 

The analysis of this problem will be pursued in Secs. 17.3 and 17.4. 


Example 2—Which Computer? 


EMERALD UNIVERSITY currently has one large computer that is shared by every- 
one on campus. Because students have been experiencing long turnaround times, the 
university now is planning to lease an additional small batch-processing computer for 
the exclusive use of its students, while reserving the large computer for the other 
users. Two models are being considered: one from the MBI Corporation and the other 
from the EG Company. The MBI computer costs more but is somewhat faster than 
the EG computer. In particular, if a sequence of typical student programs were run 
continuously for 1 hour, the number completed would have a Poisson distribution 
with a mean of 30 and 25 for the MBI and the EG computers, respectively. A statistical 
study has shown that the student population actually submits programs to be run every 
3 minutes on the average during all operating hours, and that the time from one 
submission to the next has an exponential distribution with this mean. The leasing
cost per operating hour would be $100 for the MBI computer and $75 for the EG
computer.

Thus the queueing system of concern has the new small computer as its (single) 
server and the students’ programs as its customers. Furthermore, this system fits the 
M/M/1 model presented at the beginning of Sec. 16.6. With 1 hour as the unit of
time, \(\lambda\) = (60 minutes per hour)/(3 minutes per customer) = 20 customers per hour,
and \(\mu\) = 30 and 25 customers per hour with the MBI and the EG computers,
respectively. You will see in Secs. 17.3 and 17.4 how the decision was made between
the two computers.


Example 3—How Many Tool Cribs? 


The MECHANICAL COMPANY is designing a new plant. This plant will need to 
include one or more tool cribs in the factory area to store tools required by the shop 
mechanics. The tools will be handed out by clerks as the mechanics arrive and request 
them and will be returned to the clerks when they are no longer needed. In existing 
plants, there have been frequent complaints from foremen that their mechanics have 
had to waste too much time traveling to tool cribs and waiting to be served, so it 
appears that there should be more tool cribs and more clerks in the new plant. On the 
other hand, management is exerting pressure to reduce overhead in the new plant, 
and this reduction would lead to fewer tool cribs and fewer clerks. To resolve these
conflicting pressures, an operations research study is to be conducted to determine 
just how many tool cribs and clerks the new plant should have. 

Each tool crib constitutes a queueing system, with the clerks as its servers and 
the mechanics as its customers. Based on previous experience, it is estimated that the 
time required by a tool crib clerk to service a mechanic has an exponential distribution, 
with a mean of 1/2 minute. Judging from the anticipated number of mechanics in the
entire factory area, it is also predicted that they would require this service randomly
but at a mean rate of two mechanics per minute. Therefore, it was decided to use the
M/M/s model of Sec. 16.6 to represent each queueing system. With 1 hour as the
unit of time, \(\mu = 120\). If only one tool crib were to be provided, \(\lambda\) also would be
120. With more than one tool crib, this mean arrival rate would be divided among
the different queueing systems.

The total cost to the company of each tool crib clerk is about $10/hour. The 
capital recovery costs, upkeep costs, and so forth, associated with each tool crib 
provided are estimated to be $8/working hour. While a mechanic is busy, the value 
to the company of his output averages about $24/hour. 

Sections 17.3 and 17.5 include discussions of how this (and additional) infor- 
mation was used to make the required decisions. 


17.2 Decision Making 


Queueing-type situations that require decision making arise in a wide variety of contexts.
For this reason, it is not possible to present a meaningful decision-making
procedure that is applicable to all these situations. Instead, this section attempts to 
give a broad conceptual picture of the general approach to a predominant group of 
waiting-line problems. 


A large proportion of waiting-line problems that arise in practice involve making 
one or a combination of the following decisions: 


1. Number of servers at a service facility. 
2. Efficiency of the servers. 
3. Number of service facilities. 


When such problems are formulated in terms of a queueing model, the corresponding
decision variables usually would be s (number of servers at each facility), \(\mu\) (mean
service rate per busy server), and \(\lambda\) (mean arrival rate at each facility). The number
of service facilities is directly related to \(\lambda\) because, assuming a uniform workload
among the facilities, \(\lambda\) equals the total mean arrival rate to all facilities divided by
the number of facilities.

Refer back to Sec. 17.1 and note how the three examples there respectively 
illustrate situations involving these three decisions. In particular, the decision facing 
Simulation, Inc., is how many repairers (servers) to provide. The problem for Emerald 
University is how fast a computer (server) is needed. The problem facing Mechanical 
Company is how many tool cribs (service facilities) to install, as well as how many 
clerks (servers) to provide at each facility. 

The first kind of decision is particularly common in practice. However, the other 
two also arise frequently, particularly for the business-industrial internal service sys- 
tems described in Sec. 16.3. One example illustrating a decision on the efficiency of 
the servers is the selection of the type of materials-handling equipment (the servers) 
to purchase to transport certain kinds of loads (the customers). Another such example 
is the determination of the size of a maintenance crew (where the entire crew is one 
server). Other decisions concern the number of service facilities, such as restrooms, 
first-aid centers, drinking fountains, storage areas, and so on, to distribute throughout 
an area. 

All the specific decisions discussed here involve the general question of the 
appropriate level of service to provide in a queueing system. As mentioned at the 
beginning of Chap. 16, decisions regarding the amount of service capacity to provide 
usually are based primarily on two considerations: (1) the cost incurred by providing 
the service, as shown in Fig. 17.1, and (2) the amount of waiting for that service, as 
suggested in Fig. 17.2. Figure 17.2 can be obtained by using the appropriate waiting- 
time equation from queueing theory. 


Figure 17.1 Service cost as a function of service level. (Axes: level of service versus cost of service per arrival.)

Figure 17.2 Expected waiting time as a function of service level. (Axes: level of service versus expected waiting time.)


It is readily apparent that these two considerations create conflicting pressures
on the decision maker. The objective of reducing service costs recommends a minimal 
level of service. On the other hand, long waiting times are undesirable, which rec- 
ommends a high level of service. Therefore, it is necessary to strive for some type of 
compromise. To assist us in finding this compromise, Figs. 17.1 and 17.2 may be 
combined, as shown in Fig. 17.3. The problem is thereby reduced to selecting the 
point on the curve of Fig. 17.3 that gives the best balance between the average delay 
in being serviced and the cost of providing that service. Reference to Figs. 17.1 and 
17.2 indicates the corresponding level of service. 

Unfortunately, it is all too easy to terminate further analysis and make a quick 
subjective judgment on the basis of Fig. 17.3. Actually, the most crucial portion of 
the analysis still lies ahead. An intelligent decision on the proper balance between 
delays and service costs can be made only after the relative seriousness of delays and 
service costs has been established. Obtaining the proper balance requires answers to 
such questions as ‘‘How much expenditure on service is equivalent (in its detrimental
impact) to a customer being delayed one unit of time?’’ Thus, to compare service
costs and waiting times, it is necessary to adopt (explicitly or implicitly) a common 
measure of their impact. The natural choice for this common measure is cost, so that 
it becomes necessary to estimate the cost of waiting. This cost probably cannot be 
identified entirely with expenditures on the accounting books. Nevertheless, a given 
amount of waiting can be considered to be equivalent in its long-run impact (from the 
viewpoint of the decision maker) to an expenditure of a certain amount. If it is 
reasonable to assume that this cost of waiting is proportional to the total amount of 
waiting, it is sufficient to estimate the cost of waiting per unit time per arrival. 


Figure 17.3 Relationship between average delay and service cost. (Axes: cost of service per arrival versus expected waiting time.)


A common viewpoint in practice is that the cost of waiting is often too intangible 
to be amenable to estimation; hence the decision must instead be based on more 
tangible, even if less fundamental, criteria, such as the desired expected waiting time 
in Fig. 17.3. The fallacy in this viewpoint is that it is impossible to avoid the equivalent 
of estimating waiting costs when analyzing the problem rationally. Any comparison 
of waiting times and service costs must inevitably reduce to estimating the cost that 
is equivalent to the waiting. The only question is whether the estimation should be 
done explicitly or implicitly. The answer to this question depends, in part, upon the 
time and cost required to develop a reasonable explicit estimate. Will the time and 
cost incurred be more or less than the potential savings? It seems evident that per- 
forming the penetrating analysis required to obtain this explicit estimate should provide 
a sounder basis for the required decision than superficial criteria that are ultimately 
based on these same cost considerations in a very imprecise and intuitive way. In 
addition to using a better estimate, this explicit procedure permits using rigorous 
mathematical analysis to identify accurately the decision that minimizes the total 
estimated expected cost. 

Granted that an explicit estimate of the cost of waiting is desirable, the next 
question is how to develop this estimate. Because of the diversity of waiting-line 
situations, no single estimating process is generally applicable. However, we shall 
discuss the basic considerations involved for several types of situations. 

One broad category is where the customers are external to the organization 
providing the service; i.e., they are outsiders bringing their business to the organi- 
zation. To discuss this category meaningfully, we need to divide it further in terms 
of whether the service is being provided for profit or not for profit (on a nonprofit 
basis). Consider first the case of profit-making organizations (typified by the com- 
mercial service systems described in Sec. 16.3). From the viewpoint of the decision 
maker, the cost of waiting probably consists primarily of the lost profit from lost 
business. This loss of business may occur immediately (because the customer grows 
impatient and leaves) and/or in the future (because the customer is sufficiently irritated 
that he or she does not come again). This kind of cost is quite difficult to estimate, 
and it may be necessary to revert to other criteria, such as a tolerable probability 
distribution of waiting times. When the customer is not a human being, but a job 
being performed on order, there may be more readily identifiable costs incurred, such 
as those caused by idle in-process inventories or increased expediting and administra- 
tive effort. 

Now consider the type of situation where service is provided on a nonprofit 
basis to customers external to the organization (typical of social service systems and 
some transportation service systems described in Sec. 16.3). In this case, the cost of 
waiting usually is a social cost of some kind. Thus it is necessary to evaluate the 
consequences of the waiting for the individuals involved and/or for society as a whole 
and to try to impute a monetary value to avoiding these consequences. (We shall 
illustrate this process in the next section with Example 2—the Emerald University 
computer problem—the example that most closely fits this type of situation.) Once 
again, this kind of cost is quite difficult to estimate, and it may be necessary to revert 
to other criteria. 

A situation may be more amenable to estimating waiting costs if the customers 
are internal to the organization providing the service (as for business-industrial inter- 
nal service systems). For example, the customers may be machines (as in Example 
1) or employees (as in Example 3) of a firm. Therefore, it may be possible to identify

Figure 17.4 Conceptual solution procedure for many waiting-line problems. (Expected cost versus level of service: the cost-of-service curve E(SC), the cost-of-waiting curve E(WC), and the sum of costs E(TC) = E(SC) + E(WC), with the solution marked.)


directly some or all of the costs associated with the idleness of these customers. (A 
detailed discussion of the estimating process for this situation is available elsewhere.¹)
To illustrate the underlying rationale (and to warn against a common pitfall), we 
consider the case where the customers are machine operators. At first glance, it is 
easy to jump to the conclusion that the relevant cost to the firm if such an employee 
waits in a queue is her wage during the waiting time. However, this conclusion implies 
that the net reduction in the earnings of the firm because an operator has to wait is 
equal to her wage. There is no reason why this particular relationship should hold in 
general. Perhaps one reason this approach is sometimes used in practice is the mis- 
conception that what is being wasted by the waiting is the operator’s wage. The 
operator actually will receive the same wage regardless of the waiting, and what is 
really lost is the contribution to the firm’s earnings that the operator would have made 
otherwise. Furthermore, the worker does not work in a vacuum; rather, she is the 
catalyst that causes the efforts of all the economic resources involved (machinery and 
equipment, materials, managerial skill, capital, and so on) to result in one of the 
changes in the product necessary to make it salable. Thus, although the output of the 
machine operator and her colleagues is essential if a salable product is to be produced, 
the sale of that product must not only pay their wages but also pay for the other 
necessary economic resources. Therefore, when the operator is idle because she is 
waiting in a queue, what is being wasted is productive output that would have helped 
pay the fixed expenses of the firm in addition to the operator’s wage. In short, rather 
than focusing solely on the value of the one economic resource that physically waits
in the queue, the emphasis instead should be on finding the value of all the economic 
resources that would be idled as a consequence of this waiting. This approach fre- 
quently boils down to evaluating the lost profit from all lost productivity. 

Given that the cost of waiting has been evaluated explicitly, the remainder of 
the analysis is conceptually straightforward. The objective is to determine the level 
of service that minimizes the total of the expected cost of service and the expected 
cost of waiting for that service. This concept is depicted in Fig. 17.4, where WC 
denotes waiting cost, SC denotes service cost, and TC denotes total cost. Thus the 


1 Hillier, Frederick S.: ‘‘Cost Models for the Application of Priority Waiting Line Theory to Industrial 
Problems,’’ Journal of Industrial Engineering, 16(3):178-185, 1965. 


mathematical statement of the objective is to 
Minimize E(TC) = E(SC) + E(WC). 


Sections 17.3 to 17.5 are concerned with the application of this concept to 
various types of problems. Thus Sec. 17.3 describes how E(WC) can be expressed 
mathematically. Section 17.4 then focuses on E(SC) to formulate the overall objective 
function E(TC) for several basic design problems (including some with multiple de- 
cision variables, so that the level of the service axis in Fig. 17.4 actually requires 
more than one dimension). This section also introduces the fact that when a decision 
on the number of service facilities is required, time spent in traveling to and from a 
facility should be included in the analysis (as part of the total time waiting for service). 
Section 17.5 discusses how to determine the expected value of this travel time. 


17.3 Formulation of Waiting-Cost Functions 


To express E(WC) mathematically, we must first formulate a waiting-cost function 
that describes how the actual waiting cost being incurred varies with the current 
behavior of the queueing system. The form of this function depends on the context 
of the individual problem. However, most situations can be represented by one of the 
two basic forms described next. 


The g(N) Form


Consider first the situation discussed in the preceding section where the queueing 
system customers are internal to the organization providing the service, and so the 
primary cost of waiting may be the lost profit from lost productivity. The rate at which 
productive output is lost sometimes is essentially proportional to the number of cus- 
tomers in the queueing system. However, in many cases there is not enough productive 
work available to keep all the members of the calling population continuously busy. 
Therefore, little productive output may be lost by having just a few members idle 
waiting for service in the queueing system, whereas the loss may increase greatly if 
a few more members are made idle because they require service. Consequently, the 
primary property of the queueing system that determines the current rate at which 
waiting costs are being incurred is N, the number of customers in the system. Thus 
the form of the waiting-cost function for this kind of situation is that illustrated in 
Fig. 17.5, namely, a function of N. We shall denote this form by g(N). 

The g(N) function would be constructed for a particular situation by estimating
g(n), the waiting-cost rate incurred when N = n, for n = 1, 2, ..., where g(0) = 0.
After computing the \(P_n\) probabilities for a given design of the queueing system,
we can calculate

\[ E(WC) = E\{g(N)\}. \]

Because N is a random variable, this calculation is made by using the equation for
the expected value of a function of a discrete random variable,

\[ E(WC) = \sum_{n=0}^{\infty} g(n) P_n. \]

Figure 17.5 The waiting-cost function as a function of N. (Axes: number of customers in system, n = 0, 1, 2, 3, ..., versus waiting cost per unit time, g(N).)


When g(N) is a linear function (i.e., when the waiting-cost rate is proportional to
N), then

\[ g(N) = C_w N, \]

where \(C_w\) is the cost of waiting per unit time for each customer. In this case, E(WC)
reduces to

\[ E(WC) = C_w \sum_{n=0}^{\infty} n P_n = C_w L. \]


EXAMPLE 1—HOW MANY REPAIRERS? For Example 1 of Sec. 17.1, Simulation,
Inc., has two standby gidget-making machines, so there is no lost productivity as long
as the number of customers (machines requiring repair) in the system does not exceed
two. However, for each additional customer (up to the maximum of 10 total), the
estimated lost profit is $100/day. Therefore,

\[ g(n) = \begin{cases} 0, & \text{for } n = 0, 1, 2 \\ 100(n - 2), & \text{for } n = 3, 4, \ldots, 10, \end{cases} \]

as shown in Table 17.1. Consequently, E(WC) is calculated by summing the last
column of Table 17.1 for each of the two cases of interest, namely, having one repairer
(s = 1) or two repairers (s = 2).
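
The column sum described above can be reproduced with a few lines of code. This is a minimal sketch with hypothetical names, using the rounded \(P_n\) values printed in Table 17.1; only n = 3, ..., 10 are listed because g(n) = 0 for smaller n. Since the table sums rounded products, the unrounded totals here differ slightly from its $70 and $12.

```python
def g(n):
    """Waiting-cost rate ($/day): $100 for each machine down beyond the two standbys."""
    return 100 * max(0, n - 2)


# Rounded P_n values for n = 3..10, copied from Table 17.1.
P_ONE_REPAIRER = {3: 0.139, 4: 0.097, 5: 0.058, 6: 0.029,
                  7: 0.012, 8: 0.003, 9: 7e-4, 10: 7e-5}
P_TWO_REPAIRERS = {3: 0.055, 4: 0.019, 5: 0.006, 6: 0.001,
                   7: 3e-4, 8: 4e-5, 9: 4e-6, 10: 2e-7}


def expected_waiting_cost(probabilities):
    """E(WC) = sum over n of g(n) * P_n, in dollars per day."""
    return sum(g(n) * p for n, p in probabilities.items())


if __name__ == "__main__":
    print("s = 1:", round(expected_waiting_cost(P_ONE_REPAIRER), 1))
    print("s = 2:", round(expected_waiting_cost(P_TWO_REPAIRERS), 1))
```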




The h(W) Form


Now consider the cases discussed in Sec. 17.2 where the queueing system customers 
are external to the organization providing the service. Three major types of queueing 
systems described in Sec. 16.3—commercial service systems, transportation service 
systems, and social service systems—typically fall into this category. In the case of 
commercial service systems, the primary cost of waiting may be the lost profit from 
lost future business. For transportation service systems and social systems, the primary 
cost of waiting may be in the form of a social cost. However, for either type of cost, 
its magnitude tends to be affected greatly by the size of the waiting times experienced 
by the customers. Thus the primary property of the queueing system that determines 
the waiting cost currently being incurred is W, the waiting time in the system for the 


Table 17.1 Calculation of E(WC) for Example 1

                       s = 1                    s = 2
N = n    g(n)     P_n          g(n)P_n     P_n          g(n)P_n
0        0        0.271        0           0.433        0
1        0        0.217        0           0.346        0
2        0        0.173        0           0.139        0
3        100      0.139        14          0.055        6
4        200      0.097        19          0.019        4
5        300      0.058        17          0.006        2
6        400      0.029        12          0.001        0
7        500      0.012        6           3 x 10^-4    0
8        600      0.003        2           4 x 10^-5    0
9        700      7 x 10^-4    0           4 x 10^-6    0
10       800      7 x 10^-5    0           2 x 10^-7    0
E(WC)                          $70/day                  $12/day


individual customers. Consequently, the form of the waiting-cost function for this
kind of situation is that illustrated in Fig. 17.6, namely, a function of W. We shall
denote this form by h(W).

One way of constructing the h(W) function is to estimate h(w) (the waiting cost
incurred when a customer's waiting time W = w) for several different values of w
and then to fit a polynomial to these points. The expectation of this function of a
continuous random variable is then defined as

\[ E\{h(W)\} = \int_0^{\infty} h(w) f_W(w)\, dw, \]

where \(f_W(w)\) is the probability density function of W. However, because E{h(W)} is
the expected waiting cost per customer and E(WC) is the expected waiting cost per
unit time, these two quantities are not equal in this case. To relate them it is necessary
to multiply E{h(W)} by the expected number of customers per unit time entering the
queueing system. In particular, if the mean arrival rate is a constant \(\lambda\), then

\[ E(WC) = \lambda E\{h(W)\} = \lambda \int_0^{\infty} h(w) f_W(w)\, dw. \]


Figure 17.6 The waiting-cost function as a function of W. (Axes: waiting time in the system versus waiting cost per customer, h(W).)


EXAMPLE 2—WHICH COMPUTER? Because the students of Emerald University
would experience different turnaround times with the two computers under consid- 
eration (see Sec. 17.1), the choice between the computers required an evaluation of 
the consequences of making students wait for their programs to be run. This evaluation 
led to the following conclusions. 

From the university’s viewpoint, there are two primary consequences of making 
students wait. First, it decreases the time available for the student to pursue other 
academic endeavors because little effective studying can be done during the wait. 
Second, it detracts from the efficiency and academic value of the computer assignment 
because it breaks the continuity in dealing with the problem, and it may prevent the 
student from completing the assignment satisfactorily. 

To evaluate the first consequence, an estimate was made that an average student 
studies at only one-third of full efficiency during the wait. Furthermore, by calculating 
the total expenditures from all sources for a student’s college education, a monetary 
figure of $15/hour was placed on the value of being able to study at full efficiency. 
Therefore, this component of waiting cost was estimated to be $10/hour, that is,
10W, where W is expressed in units of hours.

To evaluate the second consequence, two groups of students were interviewed
just after experiencing turnaround times of 1/2 hour and 1 hour, respectively. They were
asked to estimate how much additional time working on the assignment had been
caused by the break in the continuity, as well as the effect of the wait on their ability
to complete the assignment satisfactorily. On this basis, averages of $2 and $8 were
imputed to the value of a student being able to avoid this consequence entirely rather
than having a wait of 1/2 hour and 1 hour, respectively. Therefore, this component of
the waiting cost was estimated to be \(8W^2\).

This analysis yields

\[ h(W) = 10W + 8W^2. \]

Because

\[ f_W(w) = \mu(1 - \rho) e^{-\mu(1 - \rho) w} \]

for the M/M/1 model (see Sec. 16.6) fitting this single-server queueing system,

\[ E\{h(W)\} = \int_0^{\infty} (10w + 8w^2)\, \mu(1 - \rho) e^{-\mu(1 - \rho) w}\, dw. \]

Using the fact that \(\mu(1 - \rho) = \mu - \lambda\) for a single-server system, the values of \(\mu\)
and \(\lambda\) presented in Sec. 17.1 give

\[ \mu(1 - \rho) = \begin{cases} 10, & \text{for the MBI computer} \\ 5, & \text{for the EG computer.} \end{cases} \]

Evaluating the integral for these two cases yields

\[ E\{h(W)\} = \begin{cases} 1.16, & \text{for the MBI computer} \\ 2.64, & \text{for the EG computer.} \end{cases} \]


This result represents the expected waiting cost (in dollars) for each student arriving
with a computer program to be run. Because \(\lambda = 20\), the total expected waiting cost
per hour becomes

\[ E(WC) = \begin{cases} \$23.20/\text{hr}, & \text{for the MBI computer} \\ \$52.80/\text{hr}, & \text{for the EG computer.} \end{cases} \]
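
The integral above does not need to be evaluated by hand. The sketch below (hypothetical function names) relies on a standard fact that the text does not state explicitly: an exponential random variable with parameter \(\theta\) has first moment \(1/\theta\) and second moment \(2/\theta^2\). Here \(\theta = \mu - \lambda\).

```python
def expected_cost_per_customer(mu, lam, linear=10.0, quadratic=8.0):
    """E{h(W)} for h(W) = linear*W + quadratic*W^2 in an M/M/1 queue.

    W is exponential with parameter mu*(1 - rho) = mu - lam, so
    E[W] = 1/(mu - lam) and E[W^2] = 2/(mu - lam)^2.
    """
    theta = mu - lam
    if theta <= 0:
        raise ValueError("steady state requires lam < mu")
    return linear / theta + quadratic * 2.0 / theta**2


def expected_waiting_cost_per_hour(mu, lam):
    """E(WC) = lam * E{h(W)}, in dollars per hour."""
    return lam * expected_cost_per_customer(mu, lam)


if __name__ == "__main__":
    for name, mu in (("MBI", 30.0), ("EG", 25.0)):
        print(name, round(expected_cost_per_customer(mu, 20.0), 2),
              round(expected_waiting_cost_per_hour(mu, 20.0), 2))
```

With \(\lambda = 20\), this gives the per-customer costs of 1.16 and 2.64 and the hourly costs of 23.20 and 52.80 quoted in the example.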


THE LINEAR CASE: When h(W) is a linear function,

\[ h(W) = C_w W, \]

E(WC) reduces to

\[ E(WC) = \lambda E(C_w W) = C_w [\lambda E(W)] = C_w L, \]

where the last step uses Little's formula, \(L = \lambda E(W)\).

Note that this result is identical to the result when g(N) is a linear function.
Consequently, when the total waiting cost incurred by the queueing system is simply
proportional to the total waiting time, it does not matter whether the g(N) or the h(W)
form is used for the waiting-cost function.


EXAMPLE 3—HOW MANY TOOL CRIBS? As indicated in Sec. 17.1, the value to
the Mechanical Company of a busy mechanic’s output averages about $24/hour. Thus
\(C_w = 24\). Consequently, for each tool crib the expected waiting cost per hour is


E(WC) = 24L, 


where L represents the expected number of mechanics waiting (or being served) at 
the tool crib. 
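
To turn E(WC) = 24L into a number, L must come from the M/M/s model. The sketch below uses the standard M/M/s steady-state formulas; the function names are hypothetical, and the choice of a single tool crib staffed by two clerks is an illustrative assumption, not a design taken from the text.

```python
from math import factorial


def mms_expected_number_in_system(lam, mu, s):
    """L for an M/M/s queue with arrival rate lam and service rate mu per server."""
    a = lam / mu           # offered load
    rho = a / s            # utilization per server
    if rho >= 1:
        raise ValueError("steady state requires lam < s * mu")
    # P_0 from the standard M/M/s normalization.
    p0 = 1.0 / (sum(a**n / factorial(n) for n in range(s))
                + a**s / (factorial(s) * (1 - rho)))
    # Expected queue length (excluding customers in service).
    lq = p0 * a**s * rho / (factorial(s) * (1 - rho) ** 2)
    return lq + a


def tool_crib_waiting_cost(lam, mu, s, cost_per_hour=24.0):
    """E(WC) = C_w * L for one tool crib, in dollars per hour."""
    return cost_per_hour * mms_expected_number_in_system(lam, mu, s)


if __name__ == "__main__":
    # Assumed design: one crib (lam = 120 per hour) with s = 2 clerks.
    print(round(tool_crib_waiting_cost(120.0, 120.0, 2), 2))
```

With one crib, a single clerk would give \(\lambda = \mu\) and no steady state, so at least two clerks are needed in this assumed design.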
17.4 Decision Models


We mentioned in Sec. 17.2 that three common decision variables in designing
queueing systems are s (number of servers), \(\mu\) (mean service rate for each server),
and \(\lambda\) (mean arrival rate at each service facility). We shall now formulate models for
making some of these decisions.


Model 1—Unknown s


Model 1 is designed for the case where both \(\mu\) and \(\lambda\) are fixed at a particular service
facility, but where a decision must be made on the number of servers to have on duty
at the facility.


FORMULATION OF MODEL 1


Definition: \(C_s\) = marginal cost of a server per unit time.
Given: \(\mu\), \(\lambda\), \(C_s\).
To find: s.
Objective: Minimize \(E(TC) = sC_s + E(WC)\).


Because only a few alternative values of s normally need to be considered, the 
usual way of solving this model is to calculate E(TC) for these values of s and select 
the minimizing one. 
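
The enumeration just described can be sketched for Example 1 as follows. The function name is hypothetical, and the birth-and-death rates are an assumption based on the description in Sec. 17.1 (failure rate min(8, 10 - n) times \(\lambda\), repair rate min(n, s) times \(\mu\)). Because the probabilities are not rounded here, the totals should land close to, but not exactly on, the tabulated values.

```python
def expected_total_cost(s, server_cost=70.0, machines=10, operators=8,
                        lam=1 / 20, mu=1 / 2):
    """E(TC) = s*C_s + E(WC) in $/day for Example 1 with s repairers."""
    # Unnormalized steady-state weights C_n of the birth-and-death process.
    weights = [1.0]
    for n in range(machines):
        birth = min(operators, machines - n) * lam   # lambda_n
        death = min(n + 1, s) * mu                   # mu_(n+1)
        weights.append(weights[-1] * birth / death)
    total = sum(weights)
    # g(n) = 100 * (n - 2) for n >= 3, and 0 otherwise.
    waiting = sum(100 * max(0, n - 2) * w / total for n, w in enumerate(weights))
    return s * server_cost + waiting


# Evaluate the few candidate values of s and keep the cheapest.
costs = {s: expected_total_cost(s) for s in (1, 2, 3)}
best_s = min(costs, key=costs.get)

if __name__ == "__main__":
    for s, cost in costs.items():
        print(f"s = {s}: E(TC) = {cost:.2f}")
    print("best s:", best_s)
```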


EXAMPLE 1—HOW MANY REPAIRERS? For Example 1 of Sec. 17.1, each repairer
(server) costs Simulation, Inc., approximately $70/day. Thus, with a day as the unit
of time, \(C_s = 70\). Using the values of E(WC) calculated in Table 17.1 then yields
the results shown in Table 17.2, which indicate that the company should continue
having just one repairer.

Table 17.2 Calculation of E(TC) for Example 1

s    sC_s        E(WC)        E(TC)
1    $70/day     $70/day      $140/day  ← minimum
2    $140/day    $12/day      $152/day
3    $210/day    ≥ $0/day     ≥ $210/day

Model 2—Unknown \(\mu\) and s


Model 2 is designed for the case where both the efficiency of service, measured by
\(\mu\), and the number of servers s at a service facility need to be selected. Alternative
values of \(\mu\) may be available because there is a choice on the quality of the servers.
(In one example, both the type and quantity of materials-handling equipment to transport
certain kinds of loads must be selected for purchase.) Another possibility is that
the speed of the servers can be adjusted mechanically. (For example, the speed of
machines frequently can be adjusted by changing the amount of power consumed,
which also changes the cost of operation.) Still another type of example is the selection
of the number of crews (the servers) and the size of each crew (which determines \(\mu\))
for jointly performing a certain task, e.g., maintenance work, loading and unloading
operations, inspection work, setup of machines, and so forth. In many cases, only a
few alternative values of \(\mu\) are available, e.g., the efficiency of the alternative types
of materials-handling equipment or the efficiency of the alternative crew sizes.


FORMULATION OF MODEL 2


Definitions: \(f(\mu)\) = marginal cost of server per unit time when mean service rate is \(\mu\).
             A = set of feasible values of \(\mu\).
Given: \(\lambda\), \(f(\mu)\), A.
To find: \(\mu\), s.
Objective: Minimize \(E(TC) = s f(\mu) + E(WC)\), subject to \(\mu \in A\).


EXAMPLE 2—WHICH COMPUTER? As indicated in Sec. 17.1, \(\mu = 30\) for the MBI
computer and \(\mu = 25\) for the EG computer, where 1 hour is the unit of time. These
computers are the only two being considered by Emerald University, so

\[ A = \{25, 30\}. \]

Because the leasing cost per operating hour would be $75 for the EG computer (\(\mu = 25\))
and $100 for the MBI computer (\(\mu = 30\)),

\[ f(\mu) = \begin{cases} 75, & \text{for } \mu = 25 \\ 100, & \text{for } \mu = 30. \end{cases} \]

The computer chosen will be the only one available for student use, so the number
of servers (computers) for this queueing system is restricted to be s = 1. Hence

\[ E(TC) = f(\mu) + E(WC), \]

where E(WC) is given in Sec. 17.3 for the two alternatives. Thus

\[ E(TC) = 75 + 52.80 = \$127.80/\text{hr}, \quad \text{for the EG computer}, \]

\[ E(TC) = 100 + 23.20 = \$123.20/\text{hr}, \quad \text{for the MBI computer}. \]

Consequently, the decision was made to lease the MBI computer. 
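
The comparison can be written as a small enumeration over the feasible set A. This is a sketch with hypothetical names; it recomputes E(WC) from the waiting-cost function h(W) = 10W + 8W^2 of Sec. 17.3, using the first and second moments of the exponential waiting-time distribution (an assumption stated here, since the text only gives the final values).

```python
LEASE_COST = {25.0: 75.0, 30.0: 100.0}   # f(mu) in $/hour, keyed by mu
ARRIVAL_RATE = 20.0                      # lambda, programs per hour


def expected_total_cost(mu, lam=ARRIVAL_RATE):
    """E(TC) = f(mu) + E(WC) for a single M/M/1 server with h(W) = 10W + 8W^2."""
    theta = mu - lam                              # parameter of the exponential W
    cost_per_customer = 10.0 / theta + 8.0 * 2.0 / theta**2
    return LEASE_COST[mu] + lam * cost_per_customer


# Enumerate the feasible service rates and keep the cheapest.
best_mu = min(LEASE_COST, key=expected_total_cost)

if __name__ == "__main__":
    for mu in sorted(LEASE_COST):
        print(f"mu = {mu:.0f}: E(TC) = {expected_total_cost(mu):.2f}")
    print("best mu:", best_mu)
```

Here the faster MBI computer (\(\mu = 30\)) gives the lower total, in line with the decision reported above.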




This example illustrates a case where the number of feasible values of μ is finite
but the value of s is fixed. If s were not fixed, a two-stage approach could be used
to solve such a problem. First, for each individual value of μ, set C_s = f(μ), and
solve for the value of s that minimizes E(TC) for model 1. Second, compare these
minimum E(TC) for the alternative values of μ, and select the one giving the overall
minimum.
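
The two-stage approach for a finite set of feasible service rates can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of the text: it assumes an M/M/s queue and a linear waiting cost, E(WC) = C_w L, and the arrival rate and C_w used in the usage lines are hypothetical values chosen for the demonstration. Only the two leasing costs come from Example 2. The function names are likewise invented for this sketch.

```python
from math import factorial, inf

def mms_L(lam, mu, s):
    """Expected number in system, L, for an M/M/s queue (inf if unstable)."""
    a = lam / mu              # offered load, lambda/mu
    rho = a / s               # utilization per server
    if rho >= 1:
        return inf
    p0 = 1.0 / (sum(a**n / factorial(n) for n in range(s))
                + a**s / (factorial(s) * (1 - rho)))
    lq = p0 * a**s * rho / (factorial(s) * (1 - rho) ** 2)
    return lq + a

def best_s_for_mu(lam, mu, f, c_w, s_max=50):
    """Stage 1: model 1 with C_s = f(mu); returns (minimum E(TC), best s)."""
    c_s = f(mu)
    return min((s * c_s + c_w * mms_L(lam, mu, s), s)
               for s in range(1, s_max + 1))

def solve_model2(lam, feasible_mu, f, c_w):
    """Stage 2: compare the stage-1 minima over all feasible mu.

    Returns (minimum E(TC), best s, best mu).
    """
    return min(best_s_for_mu(lam, mu, f, c_w) + (mu,) for mu in feasible_mu)

# Leasing costs per hour from Example 2; lam and c_w below are HYPOTHETICAL.
leasing_cost = {25: 75.0, 30: 100.0}
etc, s, mu = solve_model2(lam=40.0, feasible_mu=[25, 30],
                          f=leasing_cost.get, c_w=60.0)
print(f"best mu = {mu}, best s = {s}, E(TC) = {etc:.2f} per hour")
```

Because the search over s is a plain enumeration up to `s_max`, the sketch does not rely on any convexity property of E(TC); a tighter stopping rule could be substituted if one is known to hold for the model in use.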

When the number of feasible values of μ is infinite (as when mechanically setting
the speed of a machine or piece of equipment within some feasible interval), another
two-stage approach sometimes can be used to solve the problem. First, for each
individual value of s, analytically solve for the value of μ that minimizes E(TC).
[This approach requires setting to zero the derivative of E(TC) with respect to μ and
then solving this equation for μ, which can be done only when analytical expressions
are available for both f(μ) and E(WC).] Second, compare these minimum E(TC) for
the alternative values of s, and select the one giving the overall minimum.

This analytical approach frequently is relatively straightforward for the case of
s = 1 (see Prob. 14). However, because far fewer and less convenient analytical
results are available for multiple-server versions of queueing models, this approach is
either difficult (requiring computer calculations with numerical methods to solve the
equation for μ) or completely impossible when s > 1. Therefore, a more practical
approach is to consider only a relatively small number of representative values of μ
and to use available tabulated results for the appropriate queueing model to obtain (or
approximate) E(TC) for these μ.
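
To make the analytical approach for s = 1 concrete, here is a minimal sketch for one special case. It assumes an M/M/1 queue, a linear service cost f(μ) = C_r μ, and a linear waiting cost E(WC) = C_w L with L = λ/(μ − λ); these assumptions and the symbols C_r and C_w are introduced here for illustration and are not taken from the text. Setting the derivative of E(TC) = C_r μ + C_w λ/(μ − λ) to zero gives C_r = C_w λ/(μ − λ)², so the minimizing rate is μ = λ + sqrt(C_w λ / C_r), provided that value is feasible.

```python
from math import sqrt

def expected_total_cost(mu, lam, c_r, c_w):
    """E(TC) = c_r*mu + c_w*L for an M/M/1 queue, where L = lam/(mu - lam)."""
    return c_r * mu + c_w * lam / (mu - lam)

def optimal_mu_mm1(lam, c_r, c_w):
    """Root of dE(TC)/dmu = c_r - c_w*lam/(mu - lam)**2 = 0 with mu > lam."""
    return lam + sqrt(c_w * lam / c_r)

# Hypothetical data chosen only to illustrate the calculation.
lam, c_r, c_w = 2.0, 3.0, 24.0
mu_star = optimal_mu_mm1(lam, c_r, c_w)
print(f"mu* = {mu_star:.3f}, "
      f"E(TC) = {expected_total_cost(mu_star, lam, c_r, c_w):.3f}")
```

The second derivative, 2 C_w λ/(μ − λ)³, is positive for μ > λ, so the stationary point is a minimum. If the set of feasible μ is an interval that excludes this value, the nearest endpoint would be used instead.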

Fortunately, under certain fairly common circumstances described next, s = 1 
(and its minimizing value of μ) must yield the overall minimum E(TC) for model 2,
so s > 1 cases need not be considered at all. 


Optimality of a single server: Under certain conditions, s = 1 necessarily is 
optimal for model 2. 


The primary conditions¹ are


1. The value of μ minimizing E(TC) for s = 1 is feasible,
2. f(μ) is either a linear function or a concave function (as defined in Appendix 1).


In effect, this optimality result indicates that it is better to concentrate service capacity 
into one fast server rather than dispersing it among several slow servers. Condition 2 
says that this concentrating of a given amount of service capacity can be done without 
increasing the cost of service. Condition 1 says that it must be possible to make μ
sufficiently large so that a single server can be used to full advantage. 


¹ There also are minor restrictions on the queueing model and the waiting-cost function. However, any of
the constant service-rate queueing models presented in Chap. 16 for s = 1 are allowed. If the g(N) form
is used for the waiting-cost function, it can be any increasing function. If the h(W) form is used, it can
be any linear function or any convex function (as defined in Appendix 1), which fits most cases of interest.


Table 17.3 Comparison of Service Efficiency for Model 2 Solutions

                          Mean Rate of Service Completions
N = n                     (s*, μ*) vs. (1, s*μ*) for (s, μ), where s* > 1
n = 0                     0 = 0
n = 1, 2, ..., s* − 1     nμ* < s*μ*
n ≥ s*                    s*μ* = s*μ*

To understand why this result holds, consider any other solution to model 2,
(s, μ) = (s*, μ*), where s* > 1. The service capacity of this system (as measured
by the mean rate of service completions when all servers are working) would be s*μ*.
We shall now compare this solution with the corresponding single-server solution
(s, μ) = (1, s*μ*) having the same service capacity. In particular, Table 17.3 compares
the mean rate at which service completions occur for each given number of
customers in the system N = n. Table 17.3 shows that the service efficiency of the
(s*, μ*) solution sometimes is worse but never is better than for the (1, s*μ*) solution
because it can use the full service capacity only when there are at least s* customers
in the system, whereas the single-server solution uses the full capacity whenever there
are any customers in the system. Because this lower service efficiency can only
increase waiting in the system, E(WC) must be larger for (s*, μ*) than (1, s*μ*).
Furthermore, the expected service cost must be at least as large because condition 2
[and f(0) = 0] implies that


s* f(μ*) ≥ f(s*μ*).


Therefore, E(TC) is larger for (s*, μ*) than (1, s*μ*). Finally, note that condition
1 implies that there is a feasible solution with s = 1 that is at least as good as
(1, s*μ*). The conclusion is that any s > 1 solution cannot be optimal for model 2,
so s = 1 must be optimal.
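
The comparison in Table 17.3 can be checked numerically for a particular case. The sketch below assumes Poisson arrivals and exponential service, so that both systems are birth-death processes whose service-completion rate depends only on the number of customers present; the values of λ, s*, and μ* are hypothetical. It computes L for each system from the steady-state probabilities, truncating the state space at a large n.

```python
def birth_death_L(lam, service_rate, n_max=5000):
    """L for a birth-death queue with arrival rate lam and state-dependent
    service-completion rate service_rate(n), truncated at n_max customers."""
    weights = [1.0]                       # unnormalized P_n, starting at n = 0
    for n in range(1, n_max + 1):
        weights.append(weights[-1] * lam / service_rate(n))
    total = sum(weights)
    return sum(n * w for n, w in enumerate(weights)) / total

# Hypothetical data: two servers of rate 2 versus one server of rate 4.
lam, s_star, mu_star = 3.0, 2, 2.0

def multi_rate(n):
    """Completion rate for (s*, mu*): only min(n, s*) servers are busy."""
    return min(n, s_star) * mu_star

def single_rate(n):
    """Completion rate for (1, s* mu*): full capacity whenever n >= 1."""
    return s_star * mu_star if n >= 1 else 0.0

l_multi = birth_death_L(lam, multi_rate)
l_single = birth_death_L(lam, single_rate)
print(f"L for (s*, mu*)   = {l_multi:.4f}")
print(f"L for (1, s*mu*)  = {l_single:.4f}")
```

With any increasing waiting-cost function of N, the smaller L of the single fast server translates into a smaller E(WC), which is the first half of the argument above.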

This result is still of some use even when one or both conditions fail to hold. 
If μ cannot be made sufficiently large to permit a single server, it still suggests that
a few fast servers should be preferred to many slow ones. If condition 2 does not 
hold, we still know that E(WC) is minimized by concentrating any given amount of 
service capacity into a single server, so the best s = 1 solution must be at least nearly 
optimal unless it causes a substantial increase in service cost. 


Model 3—Unknown λ and s


Model 3 is designed especially for the case where it is necessary to select both the 
number of service facilities and the number of servers (s) at each facility. The typical 
situation would be where a population (such as the employees in an industrial building) 
must be provided with a certain service, so a decision must be made as to what 
proportion of the population (and therefore what value of λ) should be assigned to
each service facility. Examples of such facilities include employee facilities (drinking 
fountains, vending machines, and restrooms), storage facilities, and reproduction 
equipment facilities. It may sometimes be clear that only a single server should be 
provided at each facility (e.g., one drinking fountain or one copy machine), but s 
often is also a decision variable. 


¹ For a rigorous proof of this result, see Stidham, Shaler, Jr.: "On the Optimality of Single-Server Queueing
Systems," Operations Research, 18:708-732, 1970.


To simplify our presentation, we shall require in model 3 that λ and s must be
the same for all service facilities. However, it should be recognized that a slight
improvement in the indicated solution might be achieved by permitting minor deviations
in these parameters at individual facilities. This should be investigated as part
of the detailed analysis that generally follows the application of the mathematical
model.


FORMULATION OF MODEL 3


Definitions: C_s = marginal cost of server per unit time.
C_f = fixed cost of service per service facility per unit time.
λ_p = mean arrival rate for entire population.
n = number of service facilities = λ_p/λ.
Given: μ, C_s, C_f, λ_p.
To find: λ, s.
Objective: Minimize E(TC), subject to λ = λ_p/n, where n = 1, 2, ....


It might appear at first glance that the appropriate expression for the expected 
total cost per unit time of all the facilities should be 


E(TC) ≟ n[(C_f + sC_s) + E(WC)],


where E(WC) here represents the expected waiting cost per unit time for each facility. 
However, if this expression actually were valid, it would imply that n = 1 necessarily 
is optimal for model 3. [The reasoning is completely analogous to that for the opti- 
mality of a single-server result for model 2, namely, any solution (n, s) = (n*, s*) 
with n* > 1 has higher service costs than the (n, s) = (1, n*s*) solution, and it also 
has a higher expected waiting cost because it sometimes makes less effective use of 
the available service capacity. In particular, it sometimes has idle servers at one facility 
while customers are waiting at another facility, so the mean rate of service completions 
would be less than it would be if the customers had access to all the servers at one 
common facility.] Because there are many situations where it obviously would not be 
optimal to have just one service facility (e.g., the number of restrooms in the Pen- 
tagon), something must be wrong with this expression. Its deficiency is that it con- 
siders only the cost of service and the cost of waiting at the service facilities and 
totally ignores the cost of the time wasted in traveling to and from the facilities. 
Because travel time would be prohibitive with only one service facility for a large 
population, enough separate facilities must be distributed throughout the population 
to hold travel time down to a reasonable level. 

Thus, letting the random variable T be the round-trip travel time for a customer 
coming to and going back from one of the service facilities, the total time lost by the 
customer actually is (W + T). (Recall from Chap. 16 that W is the waiting time in 
the queueing system after the customer arrives.) Therefore, a customer’s total cost 
for time lost should be based on (W + T) rather than just W. To simplify the analysis, 
we shall separate this total cost into the sum of the waiting-time cost based on W (or 
N) and the travel-time cost based on T. We shall also assume that the travel-time cost 
is proportional to T, where C_t is the cost of each unit of travel time for each customer.
For ease of presentation, suppose that the probability distribution of T is the same for
each service facility, so that C_t E(T) is the expected travel cost for each arrival at any
of the service facilities. The resulting expression for E(TC) would be


E(TC) = n[(C_f + sC_s) + E(WC) + λC_t E(T)]

because λ is the expected number of arrivals per unit time at each facility. Consequently,
if E(T) can be evaluated for each case of interest, model 3 can be solved
by calculating E(TC) for various values of s for each n and then selecting the solution 
giving the overall minimum. The next section discusses how to evaluate E(T) and 
also solves an example (Example 3 of Sec. 17.1) fitting model 3. 


17.5 The Evaluation of Travel Time 


E(T) can be interpreted as the average travel time spent by customers in coming both 
to and from a given service facility. Therefore, the value of E(T) depends very much 
upon the characteristics of the individual situation. However, we shall illustrate a 
rather general approach to evaluating E(T) by developing a basic travel-time model 
and then calculating E(T) for a particular example involving a more complicated 
situation. In both cases it is assumed that the portion of the population assigned to 
the service facility under consideration is distributed uniformly throughout the assigned 
area, that each arrival returns to its original location after receiving service, and that 
the average speed of travel does not depend upon the distance traveled. Another basic 
assumption is that all travel is rectilinear, i.e., it progresses along a system of or- 
thogonal paths (aisles, streets, highways, and so on) that are parallel to the main sides 
of the area under consideration. 


A Basic Travel-Time Model 


Description: Rectangular area and rectilinear travel, as shown in Fig. 17.7. 
Definitions: T = travel time (round trip) for an arrival. 
v = average velocity (speed) of customers in traveling 
to and from facility. 
a, b, c, d = respective distances from facility to boundary of 
area assigned to facility, as shown in Fig. 17.7. 
Given: v, a, b, c, d. 
To find: Expected value of T, E(T). 


Using an orthogonal (x, y) coordinate system, Fig. 17.7 shows the coordinates 
(x, y) of the location of a particular customer. The x and y coordinates of the location 
from which a random arrival comes actually are random variables X and Y, where X
ranges from −a to c and Y ranges from −b to d.


[Figure 17.7 Graphical representation of a basic travel-time model, where the service facility is at (0, 0)
and a random arrival comes from (and returns to) some location (x, y). The rectangular area has opposite
corners at (−a, −b) and (c, d).]


Because the total round-trip distance traveled by the random arrival is

\[ D = 2(|X| + |Y|) \]

and

\[ T = \frac{D}{v}, \]

it follows that

\[ E(T) = \frac{2}{v}\left[E\{|X|\} + E\{|Y|\}\right]. \]


Thus the problem is reduced to identifying the probability distributions of |X| and | Y| 
and then calculating their means. 

First consider |X|. Its probability distribution can be obtained directly from the 
distribution of X. Because the customers are assumed to be distributed uniformly 
throughout the assigned area, and because the height of the rectangular area is the 
same for all possible values of X = x, X must have a uniform distribution between 
−a and c, as shown in Fig. 17.8a. Because |x| = |−x|, adding the probability density
function values at x and −x then yields the probability distribution of |X| shown in
Fig. 17.8b. 

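Therefore, noting that |x| = x for x ≥ 0,

\[
\begin{aligned}
E\{|X|\} &= \int_0^{\max\{a,c\}} x\, f_{|X|}(x)\, dx \\
&= \int_0^{\min\{a,c\}} \frac{2x}{a+c}\, dx + \int_{\min\{a,c\}}^{\max\{a,c\}} \frac{x}{a+c}\, dx \\
&= \frac{1}{2(a+c)}\left[(\min\{a,c\})^2 + (\max\{a,c\})^2\right] \\
&= \frac{a^2 + c^2}{2(a+c)}.
\end{aligned}
\]
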


The analysis for |Y| is completely analogous, where the width of the rectangular
area for possible values of Y = y now determines the probability distribution of Y.


[Figure 17.8 Probability density functions of (a) X; (b) |X|. In (a), the density of X is 1/(a + c) for
−a ≤ x ≤ c. In (b), the density of |X| is 2/(a + c) for 0 ≤ x ≤ min{a, c} and 1/(a + c) for
min{a, c} < x ≤ max{a, c}.]


The result is that

\[ E\{|Y|\} = \frac{b^2 + d^2}{2(b + d)}. \]

Consequently,

\[ E(T) = \frac{1}{v}\left(\frac{a^2 + c^2}{a + c} + \frac{b^2 + d^2}{b + d}\right). \]
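
The closed-form result of the basic travel-time model is easy to put into code. The following sketch simply evaluates E(T) = (2/v)[E{|X|} + E{|Y|}] with E{|X|} = (a² + c²)/(2(a + c)) and E{|Y|} = (b² + d²)/(2(b + d)); the function names are chosen here for illustration.

```python
def mean_abs_uniform(lo, hi):
    """E{|X|} for X uniform on [-lo, hi], with lo >= 0, hi >= 0, lo + hi > 0."""
    return (lo**2 + hi**2) / (2.0 * (lo + hi))

def expected_travel_time(v, a, b, c, d):
    """E(T) for the basic travel-time model: facility at (0, 0), rectangular
    area from (-a, -b) to (c, d), rectilinear round trip at average speed v."""
    return 2.0 / v * (mean_abs_uniform(a, c) + mean_abs_uniform(b, d))

# A facility centered in a 300 x 300 ft square, walking speed 15,000 ft/hr.
centered = expected_travel_time(15_000.0, 150.0, 150.0, 150.0, 150.0)
# The same square with the facility moved to one corner (a = b = 0).
corner = expected_travel_time(15_000.0, 0.0, 0.0, 300.0, 300.0)
print(f"centered: {centered:.4f} hr, corner: {corner:.4f} hr")
```

Moving the facility from the center to a corner doubles E(T) in this case, which illustrates how strongly the result depends on the location of the facility within its assigned area.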





EXAMPLE 3—HOW MANY TOOL CRIBS? For the new plant being designed for the
Mechanical Company (see Sec. 17.1), the layout of the portion of the factory area 
where the mechanics will work is shown in Fig. 17.9. The three possible locations 
for tool cribs are identified as Locations 1, 2, and 3, where access to these locations 
will be provided by a system of orthogonal aisles parallel to the sides of the indicated 
area. The coordinates are given in units of feet. The mechanics will be distributed 
quite uniformly throughout the area shown, and each mechanic will be assigned to 
the nearest tool crib. It is estimated that the mechanics will walk to and from a tool 
crib at an average speed of slightly less than 3 miles/hour, so v is set at v = 15,000
feet/hour. 
The three basic alternatives being considered are 


Alternative 1: Have three tool cribs—use Locations 1, 2, and 3; 
Alternative 2: Have one tool crib—use Location 2; 
Alternative 3: Have two tool cribs—use Locations 1 and 3. 


The calculation of E(T) for each alternative is given next, followed by the use of 
model 3 to make the choice among them. 


[Figure 17.9 Layout for Example 3. The L-shaped area has corners at (0, 0), (600, 0), (600, 600),
(300, 600), (300, 300), and (0, 300). Location 1 is at (150, 150), Location 2 is at (450, 150), and
Location 3 is at (450, 450).]


Alternative 1 (n = 3): If all three locations were used, each tool crib would service
a 300 × 300 foot square area. Therefore, this case is just a special case of the basic
travel-time model just presented, where a = c = 150 and b = d = 150. Consequently,

\[
\begin{aligned}
E(T) &= \frac{1}{15{,}000 \text{ ft/hr}}\left(\frac{150^2 + 150^2}{150 + 150} + \frac{150^2 + 150^2}{150 + 150}\right) \\
&= \frac{1}{15{,}000 \text{ ft/hr}}\,(300 \text{ ft}) \\
&= 0.02 \text{ hr}.
\end{aligned}
\]


Alternative 2 (n = 1): With just one tool crib (in Location 2) to service the entire 
area shown in Fig. 17.9, the derivation of E(T) is a little more complicated than it is 
for the basic travel-time model. The first step is to relabel Location 2 as the origin 
(0, 0) for an (x, y) coordinate system, so that 450 would be subtracted from the first 
coordinates shown and 150 would be subtracted from the second coordinates. The 
probability density function for X is then obtained by dividing the height for each 
possible value of X = x by the total area (so that the area under the probability density 
function curve equals 1), as given in Fig. 17.10a. Combining the values for x and 
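−x then yields the probability distribution of |X| shown in Fig. 17.10b.
Hence

\[
\begin{aligned}
E\{|X|\} &= \int_0^{450} x\, f_{|X|}(x)\, dx \\
&= \int_0^{150} x\left(\frac{1}{225}\right) dx + \int_{150}^{450} x\left(\frac{1}{900}\right) dx \\
&= \frac{150^2}{450} + \frac{450^2 - 150^2}{1{,}800} \\
&= 150.
\end{aligned}
\]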





[Figure 17.10 Probability density functions of (a) X and (b) |X| for a tool crib at Location 2 of Fig. 17.9
under Alternative 2 (no other tool cribs). Panel (a) spans −450 ≤ x ≤ 150, with a breakpoint at
x = −150; panel (b) spans 0 ≤ x ≤ 450, with a breakpoint at x = 150.]


We suggest that you now try the same approach (using the width of the area
rather than the height) to derive E{|Y|}. You will find that the probability distribution
of |Y| is identical to that for |X|, so E{|Y|} = 150. As a result,

\[ E(T) = \frac{2}{15{,}000}(150 + 150) = 0.04 \text{ hr}. \]


Alternative 3 (n = 2): With tool cribs in just Locations 1 and 3, the areas assigned 
to them would be divided by a line segment between (300, 300) and (600, 0) in Fig. 
17.9. Notice that the two areas and their tool cribs are located symmetrically with 
respect to this line segment. Therefore, E(T) is the same for both, so we shall derive 
it just for the tool crib in Location 1. (You might try it for the other tool crib for 
practice—see Prob. 18.) 

Proceeding just as for Alternative 2, relabel Location 1 as the origin (0, 0) for 
an (x, y) coordinate system, so that 150 would be subtracted from all coordinates 
shown in Fig. 17.9. This relabeling leads directly to the probability density function 
of X, and then of |X|, shown in Fig. 17.11. As a result, 


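\[
\begin{aligned}
E\{|X|\} &= \frac{1}{225}\int_0^{150} x\, dx + \frac{1}{300}\int_{150}^{450}\left(1 - \frac{x}{450}\right) x\, dx \\
&= \frac{1}{225}\left[\frac{x^2}{2}\right]_0^{150} + \frac{1}{300}\left[\frac{x^2}{2} - \frac{x^3}{1{,}350}\right]_{150}^{450} \\
&= \frac{1}{225}\cdot\frac{150^2}{2} + \frac{1}{300}\left(\frac{450^2}{2} - \frac{450^3}{1{,}350}\right) - \frac{1}{300}\left(\frac{150^2}{2} - \frac{150^3}{1{,}350}\right) \\
&= 133\tfrac{1}{3}.
\end{aligned}
\]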


Next, the probability density function of Y is obtained by using the width of the 
area assigned to the tool crib at Location 1 for each possible value of Y = y and then 
dividing by the size of the area, as given in Fig. 17.12a. This result then yields the
uniform distribution of |Y| shown in Fig. 17.12b. Thus

\[ E\{|Y|\} = \frac{1}{150}\int_0^{150} y\, dy = 75. \]

Consequently,

\[ E(T) = \frac{2}{15{,}000}\left(133\tfrac{1}{3} + 75\right) = 0.0278 \text{ hr}. \]


[Figure 17.11 Probability density functions of (a) X and (b) |X| for a tool crib at Location 1 of Fig. 17.9
under Alternative 3 (the only other tool crib is at Location 3). Panel (a) spans −150 ≤ x ≤ 450; panel (b)
spans 0 ≤ x ≤ 450, with a breakpoint at x = 150.]


[Figure 17.12 Probability density functions of (a) Y and (b) |Y| for a tool crib at Location 1 of Fig. 17.9
under Alternative 3 (the only other tool crib is at Location 3). Panel (a) spans −150 ≤ y ≤ 150; panel (b)
spans 0 ≤ y ≤ 150.]
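
The three values of E(T) derived above can be cross-checked numerically without working out any density functions. The sketch below is an independent check, not the method of the text: it lays a grid of small cells over the L-shaped area of Fig. 17.9, assigns the midpoint of each cell to a tool crib, and averages the rectilinear round-trip time. The assignment rules encode the areas described for each alternative (for Alternative 3, the dividing line through (300, 300) and (600, 0) is x + y = 600); the function names and the cell size are choices made here.

```python
V = 15_000.0  # average walking speed in feet/hour (Example 3)
CRIBS = {1: (150.0, 150.0), 2: (450.0, 150.0), 3: (450.0, 450.0)}

def in_area(x, y):
    """True if (x, y) lies in the L-shaped factory area of Fig. 17.9."""
    return 0 <= x <= 600 and 0 <= y <= 600 and (y <= 300 or x >= 300)

def alt1(x, y):
    """Three cribs: each serves its own 300 x 300 ft square."""
    if y > 300:
        return 3
    return 1 if x < 300 else 2

def alt2(x, y):
    """One crib at Location 2 serves the whole area."""
    return 2

def alt3(x, y):
    """Two cribs, with areas divided by the line x + y = 600."""
    return 1 if x + y < 600 else 3

def expected_travel_time(assign, step=2.0):
    """Average round-trip rectilinear travel time over cell midpoints."""
    cells = int(600 / step)
    total, count = 0.0, 0
    for i in range(cells):
        x = (i + 0.5) * step
        for j in range(cells):
            y = (j + 0.5) * step
            if in_area(x, y):
                cx, cy = CRIBS[assign(x, y)]
                total += 2.0 * (abs(x - cx) + abs(y - cy)) / V
                count += 1
    return total / count

results = {name: expected_travel_time(rule)
           for name, rule in (("Alternative 1", alt1),
                              ("Alternative 2", alt2),
                              ("Alternative 3", alt3))}
for name, value in results.items():
    print(f"{name}: E(T) is approximately {value:.4f} hr")
```

The same grid approach extends to areas and assignment rules for which the analytical derivation would be tedious, at the cost of a small discretization error along any boundary that cuts through cells.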


Applying Model 3: Because E(T) now has been evaluated for the three alternatives 
under consideration, the stage is set for using model 3 from Sec. 17.4 to choose 
among these alternatives. Most of the data required for this model are given in Sec. 
17.1, namely, 


μ = 120/hr,    C_f = $8/hr,
C_s = $10/hr,
λ_p = 120/hr,    C_t = $24/hr,


where the M/M/s model given in Sec. 16.6 would be used to calculate L, and so on.
In addition, the end of Sec. 17.3 gives E(WC) = 24L in dollars per hour. Therefore,


E(TC) = n[(8 + 10s) + 24L + 24λE(T)].


[Table 17.4 Calculation of E(TC) in $/hr for Example 3]


The resulting calculation of E(TC) for various s for each n is given in Table 17.4,
which indicates that the overall minimum E(TC) is obtained by having three tool cribs
(so λ = 40 for each), with one clerk at each tool crib.
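
The calculation behind this comparison can be sketched as follows. The code evaluates E(TC) = n[(C_f + sC_s) + E(WC) + λC_t E(T)] with E(WC) = 24L, using the standard M/M/s formula for L and the three E(T) values derived above. It reads the example's data as a fixed cost of $8/hr per tool crib, $10/hr per clerk, and $24/hr for travel time; that reading of the cost data, like the function names and the range of s searched, is an assumption of this sketch.

```python
from math import factorial, inf

MU = 120.0      # service rate per clerk, per hour
LAM_P = 120.0   # arrival rate for the entire population, per hour
C_F = 8.0       # assumed fixed cost per tool crib, $/hr
C_S = 10.0      # assumed cost per clerk, $/hr
C_W = 24.0      # waiting cost coefficient, E(WC) = 24 L
C_T = 24.0      # travel-time cost, $/hr
E_T = {1: 0.04, 2: 0.0278, 3: 0.02}  # E(T) in hours for n = 1, 2, 3 cribs

def mms_L(lam, mu, s):
    """Expected number in system, L, for an M/M/s queue (inf if unstable)."""
    a = lam / mu
    rho = a / s
    if rho >= 1:
        return inf
    p0 = 1.0 / (sum(a**k / factorial(k) for k in range(s))
                + a**s / (factorial(s) * (1 - rho)))
    return p0 * a**s * rho / (factorial(s) * (1 - rho) ** 2) + a

def expected_total_cost(n, s):
    """E(TC) in $/hr for n tool cribs with s clerks at each one."""
    lam = LAM_P / n
    return n * ((C_F + C_S * s) + C_W * mms_L(lam, MU, s) + lam * C_T * E_T[n])

for n in sorted(E_T):
    for s in (1, 2, 3):
        print(f"n = {n}, s = {s}: E(TC) = {expected_total_cost(n, s):8.2f}")

best_cost, best_n, best_s = min((expected_total_cost(n, s), n, s)
                                for n in E_T for s in range(1, 6))
print(f"minimum: n = {best_n}, s = {best_s}, E(TC) = {best_cost:.2f}")
```

Under these assumptions the search reproduces the conclusion stated above: three tool cribs with one clerk each give the smallest E(TC), and a single crib with one clerk is ruled out because λ = μ makes the queue unstable.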


17.6 Conclusions 


This chapter has discussed the application of queueing theory for designing queueing 
systems. Every individual problem has its own special characteristics, so no standard 
procedure can be prescribed to fit every situation. Therefore, the emphasis has been 
on introducing fundamental considerations and approaches that can be adapted to most 
cases. We have focused on three particularly common decision variables (s, μ, and
λ) as a vehicle for introducing and illustrating these concepts. However, there are
many other possible decision variables (e.g., the size of a waiting room for a queueing 
system) and many more complicated situations (e.g., designing a priority queueing 
system) that can also be analyzed in a similar way. 

The time required to travel to and from a service facility sometimes is an 
important consideration. A rather general approach to evaluating expected travel time 
has been introduced by applying it to some relatively simple cases. However, once 
again, many more complicated situations can also be analyzed quite similarly. We 
have discussed the incorporation of travel-time information into the overall analysis 
only in the context of determining the number of service facilities to provide when 
customers must travel to the nearest facility. But travel-time models also can be very 
useful when the servers must travel to the customer from the service facility (e.g., 
fire trucks and ambulances), as well as in other contexts. 

Another useful area for the application of queueing theory is the development 
of policies for controlling queueing systems, e.g., for dynamically adjusting the num- 
ber of servers or the service rate to compensate for changes in the number of customers 
in the system. Considerable research is being conducted in this area. 

Queueing theory has proven to be a very useful tool, and we anticipate that its 
use will continue to grow as recognition of the many guises of queueing systems 
grows.


PROBLEMS


1. For each kind of queueing system listed in Chap. 16, Prob. 2, briefly describe the 
nature of the cost of service and the cost of waiting that would need to be considered in designing 
the system. 


2.* Suppose that a queueing system fits the M/M/1 model described in Sec. 16.6, with
λ = 2 and μ = 4. Evaluate the expected waiting cost per unit time E(WC) for this system
when its waiting-cost function has the form

(a) g(N) = 10N + 2N².

(b) g(N) = 10N, for N = 0, 1, 2
           6N², for N = 3, 4, 5
           N³, for N > 5.

(c) h(W) = 25W + W³.

(d) h(W) = W, for 0 ≤ W ≤ 1
           W², for W ≥ 1.


3. A certain queueing system has a Poisson input, with a mean arrival rate of four 
customers per hour. The service-time distribution is exponential, with a mean of 0.2 hour. The 
marginal cost of providing each server is $20/hour, where it is estimated that the cost that is 
incurred by having each customer idle (i.e., in the queueing system) is $120/hour for the first 
customer and $180/hour for each additional customer. Determine the number of servers that 
should be assigned to the system to minimize the expected total cost per hour. [Hint: Express 
E(WC) in terms of L, P_0, and ρ, and then use Figs. 16.6 and 16.7.]


4. A certain small grocery store has a single checkout stand with a full-time cashier. 
Customers arrive at the stand according to a Poisson process at a mean rate of 30 per hour. 
The service-time distribution is exponential, with a mean of 1.5 minutes. This situation has 
resulted in occasional long lines and complaints from customers. Therefore, because there is 
no room for a second checkout stand, the proposal has been made that another person be hired 
to help the cashier by bagging the groceries. This help would reduce the expected time required 
to process a customer to 1 minute, but the distribution still would be exponential. 

The total compensation for the new employee would be $5 per hour, which is just half 
that for the cashier. It is estimated that the grocery store incurs lost profit due to lost future 
business of 5¢ for each minute that each customer has to wait (including service time). The 
manager wants to determine on an expected total cost basis whether it would be worthwhile to 
hire the new person. 

(a) Which decision model presented in Sec. 17.4 applies to this problem? Why? 

(b) Use this model to determine whether to continue the status quo or to adopt the 

proposal. 


5. The problem is to choose between two types of materials-handling equipment, A and 
B, for transporting certain types of goods between certain producing centers in a job shop. 
Calls for the materials-handling unit to move a load would come essentially at random (i.e., 
according to a Poisson input process) at a mean rate of four per hour. The total time required 
to move a load has an exponential distribution, where the expected time is 12 minutes for A 
and 9 minutes for B. The total equivalent uniform hourly cost (capital recovery cost plus
operating cost) would be $50 for A and $150 for B. The estimated cost of idle goods (waiting
to be moved or in transit) because of increased in-process inventory is $20/load/hour. Fur- 
thermore, the scheduling of the work at the producing centers allows for just 1 hour from the 
completion of a load at one center to the arrival of that load at the next center. Therefore, an 
additional $100/load/hour of delay (including transit time) after the first hour is to be charged 
for lost production because of idle personnel and equipment, extra costs of expediting and 
supervision, and so forth. 

Assuming that only one materials-handling unit is to be purchased, which type of unit
should be selected? 


6. A railroad company paints its own railroad cars as needed. Alternative 1 is to provide 
two paint shops, where painting is done by hand (one car at a time in each shop), for a total 
annual cost of $300,000. The painting time for a car is 6 hours. Alternative 2 is to provide one 
spray shop involving an annual cost of $400,000. In this case, the painting time for a car (again 
done one at a time) is 3 hours. For both alternatives, the cars arrive according to a Poisson 
input, with a mean arrival rate of one every 5 hours. The cost of idle time per car is $50/hour. 
Which alternative should the railroad choose? Assume that the paint shops are always open; 
i.e., they work (24)(365) = 8,760 hours per year. 


7. An airline maintenance base wants to make a change in its overhaul operation. The 
present situation is that only one airplane can be repaired at a time, and the expected repair 
time is 36 hours, whereas the expected time between arrivals is 45 hours. This situation has 
led to frequent and prolonged delays in repairing incoming planes, even though the base operates 
continuously. The average cost of an idle plane to the airline is $3,000/hour. It is estimated 
that each plane goes into the maintenance shop five times per year. It is believed that the input 
process for the base is essentially Poisson and that the probability distribution of repair times 
is Erlang, with shape parameter k = 2. Alternative A is to provide a duplicate maintenance 
shop, so that two planes can be repaired simultaneously. The cost, amortized over a period of 
5 years, is $400,000/airplane/year. 

Alternative B is to replace the present maintenance equipment by the most efficient (and 
expensive) equipment available, thereby reducing the expected repair time to 18 hours. The 
cost, amortized over a period of 5 years, is $550,000/airplane/year. 

Which alternative should the airline choose? 


8.* A particular in-process inspection station is used to inspect subassemblies of a certain 
kind. At present there are two inspectors at the station, and they work together to inspect each 
subassembly. The inspection time has an exponential distribution, with a mean of 15 minutes. 
The cost of providing this inspection system is $20/hour. 

A proposal has been made to streamline the inspection procedure so that it can be handled 
by only one inspector. This inspector would begin by visually inspecting the exterior of the 
subassembly, and he would then use new efficient equipment to complete the inspection. The 
times required for these two phases of the inspection have independent Erlang distributions, 
with shape parameter k = 2 and means of 6 and 12 minutes, respectively. The capitalized cost
of providing this inspection system would be $15/hour. 

The subassemblies arrive at the inspection station according to a Poisson process at a 
mean rate of three per hour. The cost of having the subassemblies wait at the inspection station 
(thereby increasing in-process inventory and disrupting subsequent production) is estimated to 
be $10/hour for each subassembly. 

Determine whether to continue the status quo or adopt the proposal in order to minimize 
expected total cost per hour. 


9. A car rental agency has been subcontracting for the maintenance and repair of its 
cars. However, due to long delays in getting its cars back, the agency has decided to open its 
own maintenance shop to do this work more quickly. This shop will operate 42 hours per week. 

Alternative 1 is to hire two mechanics (at a cost of $1,000/week each), so that two cars 


can be worked on at a time. The time required by a mechanic to service a car would have an 
exponential distribution with a mean of 5 hours. 

Alternative 2 is to hire just one mechanic but to provide some additional special equip- 
ment (at a capitalized cost of $500/week) to speed up his work. In this case, the maintenance 
work on each car would be done in stages, where the time required for each stage has an Erlang 
distribution with the shape parameter k = 4, where the mean is 2 hours for the first stage and 
1 hour for the second stage. 

For both alternatives, the cars would arrive according to a Poisson process, with a mean 
arrival rate of 0.3 car per hour (during work hours). The agency estimates that its net lost 
revenue due to having its cars unavailable for rental is $100/week/car. 

Which alternative should the agency choose to minimize expected total cost per week? 


10. A certain small car-wash business is currently being analyzed to see if costs can be 
reduced. Customers arrive according to a Poisson process at a mean rate of 15 per hour, and 
only one car can be washed at a time. At present the time required to wash a car has an 
exponential distribution, with a mean of 4 minutes. It also has been noticed that if there are 
already four cars waiting (including the one being washed), then any additional arriving cus- 
tomers leave and take their business elsewhere. The lost incremental profit from each such lost 
customer is $3. 

Two proposals have been made. Proposal 1 is to add certain equipment, at a capitalized 
cost of $3/hour, which would reduce the expected washing time to 3 minutes. In addition, 
each arriving customer would be given a guarantee that if she has to wait longer than ½ hour
(according to a time slip she receives upon arrival) before her car is ready, then she receives 
a free car wash (at a marginal cost of $2 for the company). This guarantee would be well posted 
and advertised, so it is believed that no arriving customers would be lost. 

Proposal 2 is to obtain the most advanced equipment available, at an increased cost of 
$10/hour, where each car would be sent through two cycles of the process in succession. The 
time required for a cycle has an exponential distribution, with a mean of 1 minute, so total 
expected washing time would be 2 minutes. Because of the increased speed and effectiveness, 
it is believed that essentially no arriving customers would be lost. 

The owner also feels that because of the loss of customer goodwill (and consequent lost 
future business) when customers have to wait, a cost of 10 cents for each minute that a customer 
has to wait before her car wash begins should be included in the analysis of all alternatives. 

Evaluate the expected total cost per hour E(TC) of the status quo, proposal 1, and proposal 
2 to determine which one should be chosen. 


11.* A single crew is provided for unloading and/or loading each truck that arrives at 
the loading dock of a warehouse. These trucks arrive according to a Poisson input process at 
a mean rate of one per hour. The time required by a crew to unload and/or load a truck has 
an exponential distribution (regardless of the crew size). The expected time required by a one- 
person crew would be 1 hour. 

The cost of providing each additional member of the crew is $10/hour. The cost that is 
attributable to having a truck not in use (i.e., a truck standing at the loading dock) is estimated 
to be $15/hour. 

(a) Assume that the mean service rate of the crew is proportional to its size. What
should the size be to minimize the expected total cost per hour?

(b) Assume that the mean service rate of the crew is proportional to the square root of
its size. What should the size be to minimize expected total cost per hour?

12. Reconsider Prob. 31, Chap. 16. Suppose that the waiting cost for airplanes at the
maintenance base depends linearly on the number in the system with a cost of $90,000/week/
airplane. Suppose that the service rate is continuously adjustable with the cost per week of
providing service rate μ given by the expression

40,000 + 10,000μ.

Find the optimal value of μ.


13.* A machine shop contains a grinder for sharpening the machine cutting tools. A
decision must now be made on the speed at which to set the grinder.

The grinding time required by a machine operator to sharpen his cutting tool has an
exponential distribution, where the mean 1/μ can be set at anything from 4 minute to 2 minutes,
depending upon the speed of the grinder. The running and maintenance costs go up rapidly
with the speed of the grinder, so the estimated cost per minute for providing a mean of 1/μ is
$(0.1047).

The machine operators arrive to sharpen their tools according to a Poisson process at a
mean rate of one every 2 minutes. The estimated cost of an operator being away from his
machine to the grinder is $0.20/minute.

Plot the expected total cost per minute E(TC) versus μ over the feasible range for μ to
solve graphically for the minimizing value of μ.


14. Consider the special case of model 2 where (1) any μ > λ/s is feasible and (2)
both f(μ) and the waiting-cost function are linear functions, so that

$$E(TC) = s C_r \mu + C_w L,$$

where C_r is the marginal cost per unit time for each unit of a server’s mean service rate and
C_w is the cost of waiting per unit time for each customer. The optimal solution is s = 1 (by
the optimality of a single-server result), and

$$\mu = \lambda + \sqrt{\frac{\lambda C_w}{C_r}}$$

for any queueing system fitting the M/M/1 model presented in Sec. 16.6.
Show that this μ is indeed optimal for the M/M/1 model.


15. Consider a harbor with a single dock for unloading ships. The ships arrive according
to a Poisson process at a mean rate of λ ships per week, and the service-time distribution is
exponential with a mean rate of μ unloadings per week. Assume that harbor facilities are owned
by the shipping company, so that the objective is to balance the cost associated with idle ships
with the cost of running the dock. The shipping company has no control over the arrival rate
λ (i.e., λ is fixed); however, by changing the size of the unloading crew, and so on, the
shipping company can adjust the value of μ as desired.

Suppose that the expected cost per unit time of running the unloading dock is Dμ.
The waiting cost for each idle ship is some constant (C) times the square of the total waiting
time (including loading time). The shipping company wishes to adjust μ so that the expected
total cost (including the waiting cost for idle ships) per unit time is minimized. Derive this
optimal value of μ in terms of D and C.


16. Reconsider Prob. 24 in Chap. 16. 

(a) Formulate part (a) to fit as closely as possible a special case of one of the decision 
models presented in Sec. 17.4. (Do not solve.) 

(b) Because the answer for part (b) reveals a very low utilization of the machine op- 
erators (who are relatively expensive employees), the department manager has asked 
you to analyze each of the following alternative ways of organizing the work of the 
operators: (i) pool the operators so that any idle operator can take the next machine 
needing servicing, (ii) combine the operators into small crews to work together on 
any machine needing servicing within their assigned group of machines, and (iii) 
keep each operator assigned to her own group of machines but allow idle operators 
to assist busy operators on their machines. Describe each of these alternatives in
queueing theory terms, including their relationship (if any) to the decision models
presented in Sec. 17.4. Briefly indicate why each of these alternatives might decrease 
the total number of operators (thereby increasing their utilization) needed to achieve 
the required production rate. Also point out any dangers that might prevent this 
decrease. 


17. Consider a factory whose floor area is a square with 600 feet on each side. Suppose 
that one service facility of a certain kind is provided in the center of the factory. The employees 
are distributed uniformly throughout the factory, and they walk to and from the facility at an 
average speed of 3 miles/hour along a system of orthogonal aisles. 

Compute the expected travel time E(T) per arrival. 


18. Consider Alternative 3 (tool cribs in Locations 1 and 3) for the example illustrated
in Fig. 17.9. Derive E(T) for the tool crib in Location 3 by using the probability density
functions of X and Y directly for this tool crib.


19.* Suppose that the calling population for a particular service facility is uniformly 
distributed over each area shown, where the service facility is located at (0, 0). Making the 
same assumptions as in Sec. 17.5, derive the expected round-trip travel time per arrival E(T) 
in terms of the average velocity v and the distance r. 


[Figure: the areas for parts (a), (b), and following, each bounded by corners whose coordinates are multiples of r; diagrams not reproduced.]

20. A certain large shop doing light fabrication work uses a single central storage facility 
(dispatch station) for material in in-process storage. The typical procedure is that each employee 
personally delivers her finished work (by hand, tote box, or hand cart) and receives new work 
and materials at the facility. Although this procedure had worked well in earlier years when 
the shop was smaller, it appears that it may now be advisable to divide the shop into two semi- 
independent parts, with a separate storage facility for each one. You have been assigned the 
job of comparing the use of two facilities and of one facility from a cost standpoint. 

The factory has the shape of a rectangle 150 by 100 yards. Thus, letting 1 yard be the 
unit of distance, the (x, y) coordinates of the corners would be (0, 0), (150, 0), (150, 100), 
and (0, 100). With this coordinate system, the existing facility is located at (50, 50), and the 
location available for the second facility is (100, 50). 

Each facility would be operated by a single clerk. The time required by a clerk to service 
a caller has an exponential distribution, with a mean of 2 minutes. Employees arrive at the 
present facility according to a Poisson input process at a mean rate of 24 per hour. The 
employees are rather uniformly distributed throughout the shop, and if the second facility were 
installed, each employee would normally use the nearer of the two facilities. Employees walk 
at an average speed of about 5,000 yards per hour. All aisles are parallel to the outer walls of 
the shop. The net cost of providing each facility is estimated to be about $20/hour, plus
$15/hour for the clerk. The estimated total cost of an employee being idled by traveling or
waiting at the facility is $25/hour.

Given the preceding cost factors, which alternative minimizes expected total cost?


21. A job shop is being laid out in a square area with 600 feet on a side, and one of 
the decisions to be made is the number of facilities for the storage and shipping of final 
inventory. The capitalized cost associated with providing each facility would be $10/hour. 
There are just four potential locations available for these facilities, one in the middle of each
of the four sides of the square area as shown in the figure. 


The loads to be moved to a storage and shipping facility would be distributed uniformly 
throughout the shop area, and they become available according to a Poisson process at a mean 
rate of 90 per hour. Each time a load becomes available, an appropriate materials-handling 
vehicle would be sent from the nearest facility to pick it up (with an expected loading time of 
3 minutes) and bring it there, where the cost would be $40/hour for time spent in traveling, 
loading, and waiting to be unloaded. The vehicles would travel at a speed of 20,000 feet per 
hour along a system of orthogonal aisles parallel to the sides of the shop area. 

Another decision to be made is the number of men (m) to provide at each storage and 
shipping facility for unloading an arriving vehicle. These m men would work together on each 
vehicle, and the time required to unload it would have an exponential distribution, with a mean 
of 2/m minutes. The cost of providing each man is $15/hour. 

Determine the number of facilities and the value of m at each that will minimize expected 
total cost per hour. 


22.* Consider the formulation of the County Hospital emergency room problem as a 
preemptive priority queueing system, as presented in Sec. 16.8. Suppose that the following 
imputed costs are assigned to making patients wait (excluding treatment time): $10/hour for 
stable cases, $1,000/hour for serious cases, and $100,000/hour for critical cases. The cost 
associated with having an additional doctor on duty would be $40/hour. Determine on an 
expected total cost basis whether there should be one or two doctors on duty. 


23. A certain job shop has been experiencing long delays in jobs going through the 
turret lathe department because of inadequate capacity. The foreman contends that five machines 
are required, as opposed to the three machines that he now has. However, because of pressure 
from management to hold down capital expenditures, only one additional machine will be 
authorized unless there is solid evidence that a second one is necessary. 

This shop does three kinds of jobs, namely, government jobs, commercial jobs, and 
standard products. Whenever a turret lathe finishes a job, it starts a government job if one is 
waiting; if not, it starts a commercial job if any are waiting; if not, it starts on a standard
product if any are waiting. Jobs of the same type are taken on a first-come-first-served basis. 

Although much overtime work is required currently, management wants the turret lathe
department to operate on an 8-hour, 5-day-a-week basis. The probability distribution of the
time required by a turret lathe for a job appears to be approximately exponential, with a mean
of 10 hours. Jobs come into the shop according to a Poisson input process, but at a mean rate
of 6 per week for government jobs, 4 per week for commercial jobs, and 2 per week for
standard products. (These figures are expected to remain the same for the indefinite future.)

It is worth about $750, $450, and $150 to avoid a delay of one additional (working) day 
in a government, commercial, and standard job, respectively. The incremental capitalized cost 
of providing each turret lathe (including the operator and so on) is estimated to be $250/working 
day. 

Determine the number of additional turret lathes that should be obtained to minimize 
expected total cost. 


18


Inventory Theory 


18.1 Introduction 


Keeping an inventory (stock of goods) for future sale or use is very common in 
business. Retail firms, wholesalers, manufacturing companies—and even blood 
banks — generally have a stock of goods on hand. How does such a facility decide 
upon its “inventory policy,” i.e., when and how much does it replenish? In a small
firm the manager may keep track of inventory and make these decisions. However,
since this may not be feasible even in small firms, many companies have saved large
sums of money by using “scientific inventory management.” In particular, they:


1. Formulate a mathematical model describing the behavior of the inventory 
system. 

2. Derive an optimal inventory policy with respect to this model. 

3. Frequently use a computer to maintain a record of the inventory levels and
to signal when and how much to replenish.

There are several basic considerations involved in determining an inventory policy
that must be reflected in the mathematical inventory model; these are illustrated in the 
following examples. 


EXAMPLE 1: A television manufacturing company produces its own speakers, which 
are used in the production of its television sets. The television sets are assembled on 
a continuous production line at a rate of 8,000 per month. The speakers are produced 
in batches because they do not warrant setting up a continuous production line, and 
relatively large quantities can be produced in a short time. The company is interested 
in determining when and how many to produce. Several costs must be considered: 


1. Each time a batch is produced, a setup cost of $12,000 is incurred. This cost 
includes the cost of “tooling up,” administrative costs, record keeping, and
so forth. Note that the existence of this cost argues for producing speakers 
in large batches. 

2. The production of speakers in large batches leads to a large inventory. The 
estimated cost of keeping a speaker in stock is 30 cents/month. This cost 
includes the cost of capital tied up, storage space, insurance, taxes, protec- 
tion, and so on. The existence of a storage or holding cost argues for pro- 
ducing small batches. 

3. The production cost of a single speaker (excluding the setup cost) is $10 and 
can be assumed to be a unit cost independent of the batch size produced. 
(In general, however, the unit production cost need not be constant and may 
decrease with batch size.) 

4. Company policy prohibits deliberately planning for shortages of any of its 
components. However, a shortage of speakers occasionally crops up, and it 
has been estimated that each speaker that is not available when required costs 
$1.10/month. This cost includes the cost of installing speakers after the 
television set is fully assembled, storage space, delayed revenue, record 
keeping, and so forth. 


EXAMPLE 2: A wholesale distributor of bicycles is having trouble with shortages 
of the most popular inexpensive 10-speed model and is currently reviewing the in- 
ventory policy for this model. The distributor purchases this model bicycle from the 
manufacturer monthly and then supplies them to various bicycle shops in the western 
United States. Upon request from shops, the distributor wholesales bicycles to the 
individual shops in its region. The distributor has analyzed his costs and has deter- 
mined that the following are important: 


1. The shortage cost, i.e., the cost of not having a bicycle on hand when needed. 
Most models are easily reordered from the manufacturer, and stores usually 
accept a delay in delivery. Still, although shortages are permissible, the 
distributor feels that he incurs a loss, which he estimates to be $15 per 
bicycle. This cost represents an evaluation of the cost of the loss of goodwill, 
additional clerical costs incurred, and the cost of the delay in revenue re- 
ceived. On a very few competitive (in price) models, stores do not accept a 
delay, which results in lost sales. In this case, the cost of lost revenue must 
be included in the shortage cost. 

2. The holding cost, i.e., the cost of maintaining an inventory, is $1 per bicycle 
remaining at the end of the month. This cost represents the costs of capital 
tied up, warehouse space, insurance, taxes, and so on. 


3. The ordering cost, i.e., the cost of placing an order plus the cost of the 
bicycle, consists of two components: The paperwork involved in placing an 
order is estimated as $200, and the actual cost of a bicycle is $35. 


These two examples indicate that there exists a trade-off between the costs involved; 
the next section discusses the basic cost components of inventory models. 


18.2 Components of Inventory Models 


Because inventory policies obviously affect profitability, the choice among policies 
depends upon their relative profitability. Some of the costs that determine this prof- 
itability are (1) the costs of ordering or manufacturing, (2) holding or storage costs, 
(3) unsatisfied demand or shortage penalty costs, (4) revenues, (5) salvage costs, and 
(6) discount rates. [Costs (1) to (3) were encountered in Examples 1 and 2.] 

The cost of ordering or manufacturing an amount z can be represented by a 
function c(z). The simplest form of this function is one that is directly proportional 
to the amount ordered, that is, c · z, where c represents the unit price paid. Another
common assumption is that c(z) is composed of two parts: a term that is directly
proportional to the amount ordered and a term that is constant K for z positive and
zero for z = 0. For this case, if z is positive, the ordering, or production cost, is
given by K + c · z. The constant K is often referred to as the setup cost and generally
includes the administrative cost of ordering, the preliminary labor, and other expenses 
of starting a production run. There are other assumptions that can be made about this 
ordering function, but this chapter is restricted to the two cases just described. In 
Example 1, the speakers are manufactured, and the setup cost for the production run 
is $12,000. Furthermore, each speaker costs $10, so that the production cost is given 
by 


c(z) = 12,000 + 10z, for z > 0.


In Example 2, the distributor orders bicycles from the manufacturer, and the ordering 
cost is given by 


c(z) = 200 + 35z, for z > 0. 
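
The two cost functions above share one form: zero when nothing is ordered, and K + c · z otherwise. A minimal Python sketch of that form follows; the function name `ordering_cost` and the sample quantities (1,000 speakers, 10 bicycles) are illustrative assumptions, not values from the text.

```python
def ordering_cost(z, setup_cost, unit_cost):
    """Cost c(z) of ordering or producing z units.

    Returns 0 when z == 0, and setup_cost + unit_cost * z when z > 0.
    """
    if z < 0:
        raise ValueError("order quantity must be nonnegative")
    return 0 if z == 0 else setup_cost + unit_cost * z


# Example 1 (speakers): K = 12,000 and c = 10; the batch of 1,000 is hypothetical.
speaker_cost = ordering_cost(1_000, 12_000, 10)

# Example 2 (bicycles): K = 200 and c = 35; the order of 10 is hypothetical.
bicycle_cost = ordering_cost(10, 200, 35)

print(speaker_cost, bicycle_cost)
```

The explicit branch for z = 0 matters: the setup cost K is incurred only when an order is actually placed.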


The holding or storage costs represent the costs associated with the storage of 
the inventory until it is sold or used. They may include the cost of capital tied up, 
space, insurance, protection, and taxes attributed to storage. These costs may be a 
function of the maximum quantity held during a period, the average amount held, or 
the cumulated excess of supply over the amount required (demand). The latter view- 
point is usually taken in this chapter. In the bicycle example, the holding cost was 
$1 per bicycle remaining at the end of the month. This cost can be interpreted as the 
interest lost in keeping capital tied up in an “unnecessary” bicycle for a month, cost
of extra storage space, insurance, and so forth. 

The unsatisfied demand or shortage penalty cost is incurred when the amount 
of the commodity required (demand) exceeds the available stock. This cost depends 
upon the structure of the model. One such case occurs when the demand exceeds the 
available inventory, and (1) it is met by a priority shipment, or (2) it is not met at 
all. In (1) the penalty cost can be viewed as the entire cost of the priority shipment 
that is used to meet the excess demand. In (2), the situation where the unsatisfied 
demand is lost, the penalty cost can be viewed as the loss in revenue. Either situation
is known as “no backlogging of unsatisfied demand.” The scenario of the bicycle
example implies that there exist a few competitive (in price) bicycle models where 
unsatisfied demand is lost, thereby resulting in lost revenue, and hence the example 
is one where unsatisfied demand is not backlogged. The second case of demand not 
being fulfilled out of stock assumes that it is satisfied when the commodity next 
becomes available. The penalty cost can be interpreted as the loss of customers’ 
goodwill, their subsequent reluctance to do business with the firm, the cost of delayed 
revenue, and extra record keeping. This case is known as “backlogging of unsatisfied
demand.” The speaker example calls for backlogging of unsatisfied demand. If a
shortage occurs, the final assembly of the television set awaits the production of the 
next batch of speakers. Usually the unsatisfied demand cost is a function of the excess 
of demand over supply. 

The revenue cost may or may not be included in the model. If it is assumed
that both the price and the demand for the product are not under the control of the
company, the revenue from sales is independent of the firm’s inventory policy and
may be neglected. However, if revenue is neglected in the model, the loss in revenue
must then be included in the unsatisfied demand penalty cost whenever the firm cannot
meet the demand and the sale is lost. This point is discussed at length in the 10-speed
bicycles example presented on page 707. Furthermore, even in the case where demand
is backlogged, the cost of the delay in revenue must also be included in the unsatisfied
demand cost. With these interpretations, revenue will not be considered as a separate
cost in the remainder of this chapter.

The salvage value of an item is the value of a leftover item at the termination 
of the inventory period. If the inventory policy is carried on for an indefinite number 
of periods, and if there is no obsolescence, there are no leftover items. What is left 
over at the end of one period is the amount available at the beginning of the next
period. On the other hand, if the policy is to be carried out for only one period, the
salvage value represents the disposal value of the item to the firm, say, the selling 
price. The negative of the salvage value is called the salvage cost. If there is a cost 
associated with the disposal of an item, the salvage cost may be positive. Because 
the storage costs generally are assumed to be a function of excess of supply over 
demand, the salvage costs can be combined with this cost and hence they are usually 
neglected in this chapter. 

Finally, the discount rate takes into account the time value of money. When a 
firm ties up capital in inventory, it is prevented from using this money for alternative 
purposes. For example, it could invest this money in secure investments, say,
government bonds, and have a return on investment a year hence of, say, 7 percent. Thus
a dollar invested today would be worth $1.07 a year hence, or alternatively, a dollar
profit a year hence is equivalent to α = $1/1.07 today. The quantity α is known as
the discount factor. Thus, in considering the profitability of an inventory policy, the
profit or costs a year hence should be multiplied by α; 2 years hence, by α²; and so
on. 
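
As a rough illustration of this discounting rule, the Python sketch below converts a stream of yearly costs into its value today, using the 7 percent return from the text. The function names and the three-year cost stream of $100 per year are hypothetical, chosen only for the example.

```python
def discount_factor(annual_rate):
    """Discount factor alpha = 1 / (1 + annual_rate)."""
    return 1.0 / (1.0 + annual_rate)


def present_value(costs_by_year, annual_rate):
    """Discounted total of costs, where costs_by_year[n] is incurred n years hence."""
    alpha = discount_factor(annual_rate)
    return sum(cost * alpha**year for year, cost in enumerate(costs_by_year))


alpha = discount_factor(0.07)  # the 7 percent return used in the text

# Hypothetical cost stream: $100 now, $100 in one year, $100 in two years.
pv = present_value([100.0, 100.0, 100.0], 0.07)

print(round(alpha, 4), round(pv, 2))
```

A cost incurred now is multiplied by α⁰ = 1, so only the later costs are reduced by discounting.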

Of course, the convention of choosing a discount factor α that is based upon
the current value of a dollar delivered one year hence is arbitrary, and any time period
could have been used, for example, one month. It is also evident that in problems
having short time horizons, α may be assumed to be 1 (and thereby neglected) because
the current value of a dollar delivered during this short time horizon does not change 
very much. However, in problems having long time horizons, the discount factor must 
be included. 


In using quantitative techniques to seek optimal inventory policies, we use the 
criterion of minimizing the total (expected) discounted cost. Under the assumptions 
that the price and demand for the product are not under the control of the company 
and that the lost or delayed revenue is included in the shortage penalty cost, mini- 
mizing cost is equivalent to maximizing net income. Another criterion to be consid- 
ered, although it is nonquantitative but nevertheless important in practice, is that the 
resultant inventory policy be simple; i.e., the rule for indicating when to order and 
how much to order must be able to be easily described. Most of the policies considered 
possess this property. 

Inventory models are usually classified according to whether the demand for a 
period is known (deterministic demand) or whether it is a random variable having a 
known probability distribution (nondeterministic or random demand). The production 
of batches of speakers is an example of deterministic demand because it is assumed 
that they are used in television assemblies at a rate of 8,000 per month. The bicycle 
shops’ purchases of bicycles from the distributor is an example of random demand. 
This classification is frequently coupled with whether or not there exist time lags in 
the delivery of the items ordered or produced. In both the speaker and the bicycle 
examples, there was an implication that the items appeared immediately after an order 
was placed. In fact, the production of speakers may require some time, and similarly, 
the delivery of bicycles may not be instantaneous, so that time lags may have to be 
incorporated into the inventory model. 

Another possible classification relates to the way the inventory is reviewed, 
either continuously or periodically. In continuous review, an order is placed as soon 
as the stock level falls below the prescribed reorder point, whereas in the periodic 
review case, the inventory level is checked at discrete intervals, e.g., at the end of 
each week, and ordering decisions are made only at these times even if the inventory 
level dips below the reorder point during the preceding period. The production of 
speakers is an example of a continuous review, whereas the bicycle problem is an 
example of periodic review. Incidentally, in practice, a periodic review policy can be 
used to approximate a continuous review policy by making the time interval suffi- 
ciently small. 

In this chapter inventory policies are classified according to whether the demand 
is deterministic or random, and models are developed for continuous and periodic 
review policies. Instantaneous delivery will be assumed throughout.¹


18.3 Deterministic Models


This section is concerned with inventory problems where the actual demand is assumed
to be known. Several models are considered, including the well-known economic lot- 
size formulation. 


Continuous Review—Uniform Demand 


The most common inventory problem faced by manufacturers, retailers, and wholesalers
is concerned with the case where stock levels are depleted with time and then
are replenished by the arrival of new items. A simple model representing this situation
is given by the economic lot-size model. Items are assumed to be withdrawn continuously
at a known constant rate denoted by a; that is, a units are required per unit
time, say, per month. It is further assumed that items are produced (or ordered) in
equal numbers, Q at a time, and all Q items arrive simultaneously when desired (fixed
delivery lags will be considered later). The only costs to be considered are the setup
cost K, charged at the time of the production (or ordering), a production cost (or
purchase cost) of c dollars per item, and an inventory holding cost of h dollars per
item per unit of time. The inventory problem is to determine how often to make a
production run and what size it should be so that the cost per unit of time is a minimum.
This is a continuous review inventory policy. We shall first assume that shortages are
not allowed, and then we shall relax this assumption. The example of the production
of speakers in television sets satisfies this model.

¹ Results for the delivery-lag case often can be obtained from the corresponding instantaneous delivery
model by a simple modification in the calculation of some of the inventory costs.


SHORTAGES NOT PERMITTED: A cycle can be viewed as the time between production
runs. Thus, if 24,000 speakers are produced at each production run and are
used at the rate of 8,000 per month, then the cycle length is 24,000/8,000 = 3 
months. In general, the cycle length is Q/a. Figure 18.1 illustrates how the inventory 
level varies over time. 

The cost per unit time is obtained as follows: The production cost per cycle is 
given by 


$$\begin{cases} 0, & \text{if } Q = 0 \\ K + cQ, & \text{if } Q > 0. \end{cases}$$


The holding cost per cycle is easily obtained. The average inventory level during a
cycle is (Q + 0)/2 = Q/2 items per unit of time, and the corresponding cost is
hQ/2 per unit of time. Because the cycle length is Q/a, the holding cost per cycle is
given by

$$\frac{hQ^2}{2a}.$$

Therefore, the total cost per cycle is

$$K + cQ + \frac{hQ^2}{2a},$$


[Figure 18.1 Diagram of inventory level as a function of time (no shortages permitted). Axes: inventory level versus time t, with time marked at 0, Q/a, and 2Q/a.]


and the total cost per unit of time is

$$T = \frac{K + cQ + hQ^2/(2a)}{Q/a} = \frac{aK}{Q} + ac + \frac{hQ}{2}.$$


It is evident that the value of Q, say Q*, that minimizes T is found from dT/dQ = 0:

$$\frac{dT}{dQ} = -\frac{aK}{Q^2} + \frac{h}{2} = 0,$$

so that

$$Q^* = \sqrt{\frac{2aK}{h}}$$

(because d²T/dQ² > 0), which is the well-known economic lot-size result. Similarly,
the time it takes to withdraw this optimum value of Q*, say, t*, is given by

$$t^* = \frac{Q^*}{a} = \sqrt{\frac{2K}{ah}}.$$


These results will now be applied to the speaker example. The appropriate 
parameters are 


K = 12,000
h = 0.30
a = 8,000,

so that

$$Q^* = \sqrt{\frac{(2)(8{,}000)(12{,}000)}{0.30}} = 25{,}298$$

and

$$t^* = \frac{25{,}298}{8{,}000} = 3.2 \text{ months}.$$





Hence the production line is to be set up every 3.2 months and produce 25,298 
speakers. Incidentally, the cost curve is rather flat near this optimal value, so that any 
production between 20,000 and 30,000 speakers is acceptable; this fact can be seen 
in Fig. 18.3. 
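
The speaker calculation can be reproduced with a short Python sketch of the economic lot-size formulas Q* = √(2aK/h) and t* = Q*/a. The function names are illustrative assumptions; the parameter values are those of the speaker example, with c = 10 taken from Example 1.

```python
import math


def economic_lot_size(a, K, h):
    """Return (Q*, t*) for demand rate a, setup cost K, and holding cost h."""
    q_star = math.sqrt(2 * a * K / h)
    return q_star, q_star / a


def cost_per_unit_time(Q, a, K, h, c):
    """Total cost per unit time T = aK/Q + ac + hQ/2 (shortages not permitted)."""
    return a * K / Q + a * c + h * Q / 2


a, K, h, c = 8_000, 12_000, 0.30, 10
q_star, t_star = economic_lot_size(a, K, h)
print(round(q_star), round(t_star, 1))  # 25298 3.2

# Compare the monthly cost at the optimum with the ends of the 20,000 to 30,000 range.
for Q in (20_000, q_star, 30_000):
    print(round(Q), round(cost_per_unit_time(Q, a, K, h, c), 2))
```

With these values the monthly cost is about $87,589 at Q*, against $87,800 at 20,000 speakers and $87,700 at 30,000, which is consistent with the remark that the cost curve is flat near the optimum.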


SHORTAGES PERMITTED: It may be profitable to permit shortages to occur because 
the cycle length can be increased with a resultant saving in setup costs. However, this 
benefit may be offset by the cost that is incurred when shortages occur, and hence a 
detailed analysis is required. 

If shortages are allowed and are priced out at a cost of p dollars for each unit 
of demand unfilled for one unit of time, results similar to the no-shortage case can be 
obtained. Denote by S the stock on hand at the beginning of a cycle. The problem is 
summarized in Fig. 18.2. 

The cost per unit time is obtained as follows: The production cost per cycle is 
given by 


$$\begin{cases} 0, & \text{if } Q = 0 \\ K + cQ, & \text{if } Q > 0. \end{cases}$$

The holding cost per cycle is easily obtained. Note that the inventory level is positive
for a time of S/a. The average inventory level during this time is (S + 0)/2 = S/2
items per unit of time, and the corresponding cost is hS/2 per unit of time. Hence
the total holding cost incurred over the time the inventory level is positive is the
holding cost per cycle, which is given by

$$\frac{hS}{2} \cdot \frac{S}{a} = \frac{hS^2}{2a}.$$

[Figure 18.2 Diagram of inventory level as a function of time (shortages permitted).]

Similarly, shortages occur for a time (Q - S)/a. The average amount of shortages
during this time is [0 + (Q - S)]/2 = (Q - S)/2 items per unit of time, and the
corresponding cost is p(Q - S)/2 per unit of time. Hence the total shortage cost
incurred over the time shortages exist is the shortage cost per cycle, which is given
by

$$\frac{p(Q - S)}{2} \cdot \frac{Q - S}{a} = \frac{p(Q - S)^2}{2a}.$$


Therefore, the total cost per cycle is

$$K + cQ + \frac{hS^2}{2a} + \frac{p(Q - S)^2}{2a},$$


and the total cost per unit of time is

$$T = \frac{K + cQ + hS^2/(2a) + p(Q - S)^2/(2a)}{Q/a} = \frac{aK}{Q} + ac + \frac{hS^2}{2Q} + \frac{p(Q - S)^2}{2Q}.$$


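In this model there are two decision variables (S and Q), so the optimum values
(S* and Q*) are found by setting the partial derivatives ∂T/∂S and ∂T/∂Q equal to
zero. Thus

$$\frac{\partial T}{\partial S} = \frac{hS}{Q} - \frac{p(Q - S)}{Q} = 0,$$

$$\frac{\partial T}{\partial Q} = -\frac{aK}{Q^2} - \frac{hS^2}{2Q^2} + \frac{p(Q - S)}{Q} - \frac{p(Q - S)^2}{2Q^2} = 0.$$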


Solving these equations simultaneously leads to

$$S^* = \sqrt{\frac{2aK}{h}} \sqrt{\frac{p}{p + h}}, \qquad Q^* = \sqrt{\frac{2aK}{h}} \sqrt{\frac{p + h}{p}}.$$

The optimal period length t* is given by

$$t^* = \frac{Q^*}{a} = \sqrt{\frac{2K}{ah}} \sqrt{\frac{p + h}{p}}.$$

The maximum shortage is expressed as 


\[ Q^* - S^* = \sqrt{\frac{2aK}{p}}\sqrt{\frac{h}{p + h}}. \]


Further, from Fig. 18.2, the fraction of time that no shortage exists is given by 


\[ \frac{S^*/a}{Q^*/a} = \frac{p}{p + h}, \]








which is independent of K. 
If shortages are permitted in the speaker example, the cost is estimated as p = 
$1.10 per speaker. Again, 

















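\[ K = 12{,}000, \qquad h = 0.30, \qquad a = 8{,}000, \]

so that

\[ S^* = \sqrt{\frac{(2)(8{,}000)(12{,}000)}{0.30}}\sqrt{\frac{1.1}{1.1 + 0.3}} = 22{,}424, \]
\[ Q^* = \sqrt{\frac{(2)(8{,}000)(12{,}000)}{0.30}}\sqrt{\frac{1.1 + 0.3}{1.1}} = 28{,}540, \]

and

\[ t^* = \frac{28{,}540}{8{,}000} = 3.6 \text{ months}. \]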


Hence, when shortages are permitted, the production line is to be set up every 3.6 
months to produce 28,540 speakers. A shortage of 6,116 speakers is permitted. Note 
that Q* and t* are not very different from the no-shortage case.
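
The arithmetic for the speaker example can be checked numerically. The following is a minimal sketch that simply evaluates the closed-form expressions for S*, Q*, and t*; the function names are illustrative choices, not part of the text, and the cost function omits the constant term ac because it does not affect the optimum.

```python
import math

def lot_size_with_shortages(a, K, h, p):
    """Closed-form optimum of the lot-size model with planned shortages.

    a: demand rate, K: setup cost, h: holding cost per unit per unit time,
    p: shortage cost per unit short per unit time.
    Returns (S*, Q*, t*, maximum shortage).
    """
    base = math.sqrt(2 * a * K / h)        # lot size when shortages are not allowed
    S = base * math.sqrt(p / (p + h))      # stock on hand at the start of a cycle
    Q = base * math.sqrt((p + h) / p)      # quantity produced per cycle
    return S, Q, Q / a, Q - S

def variable_cost_rate(S, Q, a, K, h, p):
    """Cost per unit time, omitting the constant term a*c."""
    return a * K / Q + h * S**2 / (2 * Q) + p * (Q - S)**2 / (2 * Q)

# Speaker example: K = 12,000, h = 0.30, a = 8,000, p = 1.10
S_opt, Q_opt, t_opt, max_short = lot_size_with_shortages(a=8000, K=12000, h=0.30, p=1.10)
print(round(S_opt), round(Q_opt), round(t_opt, 1), round(max_short))
```

Evaluating `variable_cost_rate` at nearby values of S and Q gives a quick check that the closed-form point is indeed a minimum.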


QUANTITY DISCOUNTS, SHORTAGES NOT PERMITTED: The models considered
have assumed that the unit cost of an item is the same, independent of the quantity 
produced. In fact, this assumption resulted in the optimal solutions being independent 
of this unit cost. Suppose, however, that there exist cost breaks; i.e., the unit cost 
varies with the quantity ordered. For example, suppose the unit cost of producing a
speaker is c_1 = $11 if fewer than 10,000 speakers are produced, c_2 = $10 if production
falls between 10,000 and 80,000 speakers, and c_3 = $9.50 if production exceeds
80,000 speakers. What is the optimal policy? The solution to this specific problem
will reveal the general method. 

From the results of the previously considered economic lot-size model (shortages 
not permitted), the total cost per unit time if the production cost is c_j is given by

\[ T_j = \frac{aK}{Q} + ac_j + \frac{hQ}{2}, \qquad \text{for } j = 1, 2, 3. \]

A plot of T_j versus Q is shown in Fig. 18.3.

Figure 18.3 Total cost per unit time for speaker example with quantity discounts.


The feasible values of Q are shown by the solid lines, and it is only these regions 
that must be investigated. For each curve, the value of Q that minimizes T_j is easily
found by the methods used in the previously considered economic lot-size model. For
K = 12,000, h = 0.30, and a = 8,000, this value is

\[ \sqrt{\frac{(2)(8{,}000)(12{,}000)}{0.30}} = 25{,}298. \]


This number is a feasible value for the cost function T_2. Because it is evident that for
fixed Q, T_j < T_{j-1} for all j, T_1 can be eliminated from further consideration. However,
T_3 cannot be immediately discarded. Its minimum feasible value (which occurs at
Q = 80,000) must be compared to T_2 evaluated at 25,298 (which is $87,589). Because
T_3 evaluated at 80,000 equals $89,200, it is better to produce in quantities of 25,298,
and thus this quantity is the optimal value for this set of quantity discounts. If the
quantity discount led to a cost of $9 (instead of $9.50) when production exceeded
80,000, then T_3 evaluated at 80,000 would equal $85,200, and the optimal production
quantity would become 80,000.
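
The comparison just described can be sketched in a few lines of code. Because each T_j is convex in Q, its minimum over a price interval is found by clamping the unconstrained minimizer into that interval. The interval endpoints below are an assumption made for illustration: the text evaluates T_3 at Q = 80,000, so the third price is taken to apply from 80,000 upward, and quantities are treated as whole speakers. The function and variable names are likewise illustrative.

```python
import math

def total_cost(Q, a, K, h, c):
    """Cost per unit time T_j = aK/Q + a*c_j + hQ/2 (shortages not permitted)."""
    return a * K / Q + a * c + h * Q / 2

def best_lot_size(a, K, h, price_breaks):
    """Return (Q, cost, unit_cost) minimizing cost over all price intervals.

    price_breaks: list of (lower, upper, unit_cost); upper may be math.inf.
    """
    q_free = math.sqrt(2 * a * K / h)            # minimizer ignoring the price intervals
    best = None
    for lower, upper, c in price_breaks:
        q = min(max(q_free, lower), upper)       # clamp into the feasible interval
        cost = total_cost(q, a, K, h, c)
        if best is None or cost < best[1]:
            best = (q, cost, c)
    return best

# Assumed interval boundaries for the speaker example (see lead-in).
breaks = [(1, 9_999, 11.0), (10_000, 79_999, 10.0), (80_000, math.inf, 9.5)]
best_q, best_cost, best_c = best_lot_size(8000, 12000, 0.30, breaks)

# Variation discussed in the text: $9 instead of $9.50 in the top bracket.
breaks_alt = breaks[:2] + [(80_000, math.inf, 9.0)]
alt_q, alt_cost, alt_c = best_lot_size(8000, 12000, 0.30, breaks_alt)
```

With the original prices the sketch selects a lot size of about 25,298 at the $10 price, and with the $9 variation it selects 80,000, in agreement with the discussion above.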

Although this analysis concerned a very specific problem, its extension to a 
general problem is evident. Furthermore, a similar analysis can be made for other 
types of quantity discounts, such as incremental quantity discounts, where a cost c_0
is incurred for the first q_0 items, c_1 for the next q_1 items, and so on.


Remarks: Several remarks can be made about economic lot-size models. 


1. If it is assumed that the production (or purchase) cost of an item is constant 
throughout time, it does not appear in the optimal solution. This result is evident 
because no matter what policy is used, the same quantity is required, and hence this 
cost is fixed. 


2. It was previously assumed that Q, the number of units produced, was con- 
stant from cycle to cycle. A little reflection will reveal that this assumption is really 
a result rather than an assumption. 


3. These models can be viewed as a special case of an (s, S) policy. An (s, S)
policy is usually used in the context of a periodic review policy, where at review time
an order is placed to bring the inventory level up to S if the current inventory is less
than or equal to s. Otherwise, no order is placed.¹ The symbol S then denotes the
reorder level, and s denotes the reorder point. In the economic lot-size models, s 
denotes the inventory level when items are ordered, so that when shortages are not 
permitted the reorder point s is zero, and when shortages are permitted, s is equal to 
the negative of the maximum shortage; that is, 


\[ s = -\sqrt{\frac{2aK}{p}}\sqrt{\frac{h}{p + h}}. \]


Furthermore, because the economic lot-size models are continuously reviewed, when
the inventory level equals s an order is placed, bringing the inventory up to the reorder 
level S. Hence, for economic lot-size models the (s, S) policy can be described as 
follows: When the inventory level reaches the reorder point s, place an order to bring 
the inventory level up to reorder level S; that is, order Q = S — s. 





4. It is evident from the analysis presented that the reorder point will never be 
positive. A policy that calls for s > 0 cannot be optimal because it is dominated by 
a policy that calls for ordering the same Q, but only when the inventory level reaches
zero. It is dominated in that the latter policy has the same setup and purchase costs 
but a uniformly smaller holding cost. 


5. A known fixed delivery lag is easily accommodated. Denote by λ the lead
time between the placing and receiving of the order. It is assumed that λ is constant
over time and independent of the size of the order. It is evident that if it is desired to
have the order arrive the moment the inventory level reaches s, then the order must
be placed λ periods earlier. Thus the reorder point is simply

\[ s + \lambda a, \]

where s is determined for the no-lag situation.²


Periodic Review—A General Model for Production Planning 


The last section explored the economic lot-size model. The results were dependent 
upon the assumption of a constant demand rate. When this assumption is relaxed, 
i.e., when the amounts required from period to period are allowed to vary, the square- 
root formula no longer ensures a minimum cost solution. 

Consider the following model, due to Wagner and Whitin.³ As before, the only
costs to be considered are the setup cost K, charged at the beginning of the period; a
production cost (or purchase cost) of c dollars per item; and an inventory holding cost
of h dollars per item, which is charged (arbitrarily) at the end of the time period. The
choice of charging the inventory at the end of the period, and hence as a function of 
the excess of the supply over the requirement, is somewhat different from the holding 
charge incurred in the economic lot-size models. In the latter case, the average cost 


1 Usually (s, S) policies are described as ordering when the inventory level is less than s rather than when
the inventory level is less than or equal to s. However, the costs are often the same if an optimal policy
is followed.

2 This result holds only when λa < Q.

3 Wagner, H. M., and T. M. Whitin: “Dynamic Version of the Economic Lot-Size Model,” Management
Science, 5(1):89-96, 1958.
per unit of time was charged. Clearly, different policies can result from alternative
ways of dealing with holding costs. In addition, r_i (i = 1, 2, ..., n) represents the
requirements at time i, and it is assumed that these requirements must be met. Initially
there is no stock on hand. For a horizon of n periods, the inventory problem is to 
determine how much should be produced at the beginning of each time period (as- 
sumed to be instantaneous) to minimize the total cost incurred over the n periods. 
The model can be illustrated by the following variation of the speaker example. 


EXAMPLE: A market survey conducted by the television manufacturer has indicated 
that the demand for television sets is seasonal rather than uniform. In particular, sales
of 30,000 sets are forecast for the Christmas season (October to December), 20,000
for the winter slack season (January to March), 30,000 for the “new model” season
(April to June), and 20,000 for the summer season (July to September). Because of
the necessity of meeting the increased demand during the peak seasons, the television
set production line was revamped. This revised production line enabled the company
to introduce new equipment as well as to redesign some of its components, including
the speakers. Hence the setup cost for speaker production is now $20,000, but the
unit cost is down to $1. Furthermore, the holding cost of a speaker has been reduced
to 20 cents per (3-month) period. Finally, the labor and equipment costs are such that
the speakers must be produced in increments of 10,000. It is assumed that production
of the television sets is completed and ready for shipment in the 3-month period prior
to the season in which it is required. Thus the 30,000 sets required for the Christmas
season are to be assembled during the July-to-September period. The speaker is the
last component added to the television set, and it is easily installed. Furthermore,
large quantities of speakers can be produced in a very short time, so their production 
and subsequent installation can be viewed as instantaneous. The problem is to deter- 
mine how many to produce in each period while satisfying the requirements and 
minimizing the total cost. A solution, though not an optimal one, is given in Fig. 
18.4; this policy calls for producing 30,000 speakers at the beginning of the first 
period (Christmas season), 60,000 speakers at the beginning of the second period, 
and 10,000 speakers at the beginning of the fourth period. 

One approach to the solution of this model is to enumerate, for each of the 
2^{n-1} combinations of either producing or not producing in each period, the possible
Figure 18.4 Production schedule that satisfies requirements.


quantities that can be produced. This approach is rather cumbersome, even for mod- 
erate-sized n. Hence a more efficient method is desirable. In particular, the method 
of dynamic programming introduced in Chap. 11 will be applied, followed by the 
introduction of an algorithm (in the ensuing section) that exploits the structure. Finally, 
a mathematical programming solution will be presented. 


THE DYNAMIC PROGRAMMING SOLUTION: Following the notation introduced for 
dynamic programming (Chap. 11) and interpreting the variables in the inventory con- 
text, the ith stage corresponds to the ith period; the state corresponds to the inventory 
entering period i and will be denoted by x_i; and the decision variable corresponds to
the quantity produced (or ordered) at the beginning of period i and will be denoted
by z_i.

Let B_i(x_i, z_i) denote the costs incurred in period i, given the entering inventory
and the quantity produced. Recall that the only costs considered are the setup cost K,
charged at the beginning of the period, a production cost (or purchase cost) of c dollars
per item, and an inventory holding cost of h dollars per item charged at the end of
the time period. The requirement for period i is r_i. Then B_i(x_i, z_i) is given by

\[
B_i(x_i, z_i) =
\begin{cases}
K + cz_i + h(x_i + z_i - r_i), & \text{if } z_i > 0 \\
h(x_i - r_i), & \text{if } z_i = 0.
\end{cases}
\]


Denote by C_i(x_i, z_i) the total cost of the best overall policy from the beginning
of period i to the end of the planning horizon, given that the inventory level entering
period i is x_i and z_i is chosen to be produced; and let C_i*(x_i) denote the corresponding
minimum value of C_i(x_i, z_i), subject to the constraints that z_i ≥ 0 and that the
requirements for the periods are met. Therefore,

\[
C_i^*(x_i) = \min_{\substack{z_i \ge 0 \\ z_i \ge r_i - x_i}}
\left[ B_i(x_i, z_i) + C_{i+1}^*(x_i + z_i - r_i) \right],
\]

for i = 1, 2, ..., n, and C_{n+1}*(·) is defined to be zero. In the speaker example,
there are four periods to be considered. (All calculations have been reduced in scale 


by a factor of 10,000.) The first iteration corresponds to i = 4, that is, a description 
of the optimal policy at the beginning of period 4. For this case, 


\[ C_4^*(x_4) = \min_{z_4 \ge 2 - x_4} \left[ B_4(x_4, z_4) \right], \]

so that the immediate solution to the fourth-period problem is as follows:

x_4    z_4*    C_4*(x_4)
0      2       4
1      1       3
2      0       0


1 For the notation in this chapter to be consistent with that used in inventory theory in general, it may
differ somewhat from the notation introduced in Chap. 11. 
The second iteration requires finding the optimal policy from the beginning of
period 3 to the end of period 4. For this case,

\[ C_3^*(x_3) = \min_{z_3 \ge 3 - x_3} \left[ B_3(x_3, z_3) + C_4^*(x_3 + z_3 - 3) \right], \]























A typical entry can be verified. Let x_3 = 2 and z_3 = 3. Then 3 units are produced,
given an initial inventory of 2. The cost of production is then 2 + 1(3) = 5, and the
inventory holding cost is (1/5)(2 + 3 − 3) = 0.4. The entering inventory in period 4
is then 2, so that C_4*(2) = 0. Hence the total cost is given by 5 + 0.4 = 5.4, which
is the value given.

The third iteration requires finding the optimal policy from the beginning of 
period 2 to the end of period 4. For this case, 


\[ C_2^*(x_2) = \min_{z_2 \ge 2 - x_2} \left[ B_2(x_2, z_2) + C_3^*(x_2 + z_2 - 2) \right], \]


so that the solution to the problem beginning at period 2 is as follows: 
Finally, the last iteration requires finding the optimal policy from the beginning
of period 1 to the end of period 4. For this case,

\[ C_1^*(0) = \min_{z_1 \ge 3} \left[ B_1(0, z_1) + C_2^*(z_1 - 3) \right], \]


so that the solution to the production planning problem is as follows: 








[Table: B_1(0, z_1) + C_2*(z_1 − 3) for z_1 = 3, 4, ..., 10]
Therefore, the optimal production schedule is to produce all the speakers at the
beginning of the first period, or produce 50,000 speakers (5 units) at the beginning 
of the first period and 50,000 speakers (5 units) at the beginning of the third period. 
The minimum cost is $148,000. 

It should be noted that the unit cost c is irrelevant to the problem because over
all the time periods, all policies use the same number of items at the same total cost. 
Hence this cost could have been neglected, and the same optimal policies would have 
been obtained. Different costs in different periods are easily handled by this method 
of solution. For example, during the peak production periods (when 30,000 speakers 
are produced), the workers are fully engaged in the assembly of television sets, so 
that the speakers must be produced during overtime hours. Hence the unit cost is 
increased. This increase would be reflected in the calculations of C_3*(x_3) and C_1*(x_1).
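
The backward recursion described in this section can be sketched in code for the speaker example. This is only an illustration under stated assumptions: costs are expressed in whole dollars per unit of 10,000 speakers (so that ties between policies compare exactly), production in a period is capped at the total remaining requirement (producing more can only add cost), and the names `period_cost` and `solve` are illustrative choices.

```python
from functools import lru_cache

# Speaker example: quantities in units of 10,000 speakers, costs in dollars.
K = 20_000        # setup cost
c = 10_000        # production cost per unit (10,000 speakers at $1 each)
h = 2_000         # holding cost per unit per period (10,000 speakers at $0.20)
r = (3, 2, 3, 2)  # requirements for periods 1 to 4
n = len(r)

def period_cost(i, x, z):
    """B_i(x_i, z_i): setup (if z > 0) + production + end-of-period holding cost."""
    return (K if z > 0 else 0) + c * z + h * (x + z - r[i])

@lru_cache(maxsize=None)
def solve(i, x):
    """Return (C_i*(x), tuple of minimizing z) for 0-based period index i."""
    if i == n:
        return 0, ()
    z_lo = max(0, r[i] - x)        # the requirement of period i must be met
    z_hi = sum(r[i:]) - x          # assumed cap: never exceed remaining requirements
    costs = {z: period_cost(i, x, z) + solve(i + 1, x + z - r[i])[0]
             for z in range(z_lo, z_hi + 1)}
    c_min = min(costs.values())
    return c_min, tuple(z for z, v in costs.items() if v == c_min)

min_cost, first_decisions = solve(0, 0)
print(min_cost, first_decisions)
```

Here `solve(0, 0)` plays the role of C_1*(0), and the second component lists every first-period quantity that attains the minimum, so alternative optimal schedules are not lost.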


Periodic Review—Production Planning: An Algorithm 


In the previous section, dynamic programming was applied to solve the production 
planning model. Alternatively, by streamlining the dynamic programming approach,
we shall develop an algorithm that exploits the structure of the model. Initially the 
production planning model we consider will be the same as that presented in the 
previous section, i.e., arbitrary demand requirement, a fixed setup cost, and linear 
production and holding costs. 

The following result characterizes an optimal policy: 

For an arbitrary demand requirement, a fixed setup cost, and linear production 
and holding costs, there is an optimum policy that produces only when the inventory
level is zero.

To show why this result is true, choose any policy. Consider the time interval
that begins when production is made with zero stock level and ends with the first time 
production is made when the stock is not at zero level. For the policy given in Fig. 
18.4, this interval starts at the beginning of period 2 and ends at the beginning of 
period 4, when one unit is produced. This time period is also shown by the solid lines 
in Fig. 18.5. 

Consider the alternative policy, which implies production of 50,000 speakers at
the beginning of period 2 and production of 20,000 speakers at the beginning of period
4. This policy is shown by the dashed lines in Fig. 18.5. This policy, B, dominates
policy A in that the total cost is smaller. The setup and the production costs for both 
policies are the same. It is evident that the holding cost for B is smaller than that for 
A because there is always less stock on hand at the end of a period. Therefore, B is 
better than A, so that A cannot be optimal. 
[Figure: stock (in units of 10,000 speakers) plotted against period.]
Figure 18.5 Production schedules.


This characterization of optimal policies can be used to determine which policies 
are not optimal. In addition, because it implies that the amount produced at
the beginning of the ith period must be either 0, r_i, r_i + r_{i+1}, ..., or r_i + r_{i+1} +
··· + r_n, it can be exploited to obtain an efficient algorithm.

Suppose an optimal policy is presented. Consider the time from the initial pro- 
duction at the beginning of the first period to the first time the inventory level is again 
zero. The total cost of the subsequent periods must be a minimum for this reduced 
problem because the overall policy is optimal. Therefore, in the context of the speaker 
example, if the inventory level is zero for the first time (after the initial production) 
at the end of the second period, and if an optimal policy is being followed, all that 
remains to be done is to determine the optimal policy for the last two periods (periods
3 and 4) that have a requirement of 30,000 speakers for period 3 and 20,000 speakers 
for period 4. 

Let C_i denote the total cost of the best overall policy from the beginning of
period i, when no stock is available, to the end of the planning horizon, i =
1, 2, ..., n. A recursive relationship for C_i is given by

\[
C_i = \min_{j = i, i+1, \ldots, n}
\left[ C_{j+1} + K + c(r_i + r_{i+1} + \cdots + r_j)
+ h\bigl(r_{i+1} + 2r_{i+2} + 3r_{i+3} + \cdots + (j - i)r_j\bigr) \right],
\]


where j can be viewed as an index that denotes the (end of the) period when the 
inventory reaches a zero level for the first time after the production at the beginning 
of period i. The cost C_{n+1} is 0, c(r_i + r_{i+1} + ··· + r_j) is the cost of the production
from period i until the inventory level next reaches zero, and the quantity h( ) is the
total holding cost of the inventory that results from the production from period i and
remaining until the inventory level next reaches zero. This latter cost is charged at 
the end of every period as a function of the excess, if any, over the requirement. 

The solution of this algorithm is much simpler than the dynamic programming
approach. As in dynamic programming, C_n, C_{n-1}, ..., C_2 must be found before C_1
is obtained. However, the number of calculations is much smaller, and the number
of possible production values is greatly reduced.


EXAMPLE: Returning to the speaker example, first consider the case of finding C_4,
the cost of the optimal policy from the beginning of period 4 to the end of the planning
horizon:

\[ C_4 = C_5 + 2 + 1(2) = 0 + 2 + 2 = 4. \]


To find C_3 we must consider two cases; i.e., the first time after period 3 when
the inventory reaches a zero level can occur at (1) the end of the third period or (2)
the end of the fourth period. In the recursive relationship j may range over period 3
or 4, resulting in the costs C_3^(3) or C_3^(4), respectively. The policy associated with C_3^(3)
calls for producing only for period 3 and then following the optimal policy for period
4, whereas the policy associated with C_3^(4) calls for producing for periods 3 and 4.
The cost C_3 is then the minimum of C_3^(3) and C_3^(4). These cases are reflected by the
policies given in Fig. 18.6.


\[ C_3^{(3)} = C_4 + 2 + 1(3) = 4 + 2 + 3 = 9 \]
and
\[ C_3^{(4)} = C_5 + 2 + 1(3 + 2) + \tfrac{1}{5}(2) = 0 + 2 + 5 + 0.4 = 7.4. \]
Hence C_3 = min(7.4, 9.0) = 7.4.


To find C_2 we must consider three cases; i.e., the first time after period 2 when
the inventory reaches a zero level can occur at (1) the end of the second period, (2)
the end of the third period, or (3) the end of the fourth period. In the recursive
relationship j may range over period 2, 3, or 4, resulting in costs C_2^(2), C_2^(3), or C_2^(4),
respectively. The cost C_2 is then the minimum of C_2^(2), C_2^(3), and C_2^(4).


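\[ C_2^{(2)} = C_3 + 2 + 1(2) = 7.4 + 2 + 2 = 11.4, \]
\[ C_2^{(3)} = C_4 + 2 + 1(2 + 3) + \tfrac{1}{5}(3) = 4 + 2 + 5 + 0.6 = 11.6, \]
and
\[ C_2^{(4)} = C_5 + 2 + 1(2 + 3 + 2) + \tfrac{1}{5}[3 + 2(2)] = 0 + 2 + 7 + 1.4 = 10.4. \]
Hence C_2 = min(11.4, 11.6, 10.4) = 10.4.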


Finally, to find C_1 we must consider four cases; i.e., the first time after period
1 when the inventory reaches zero can occur at (1) the end of the first period, (2) the
end of the second period, (3) the end of the third period, or (4) the end of the fourth
period. In the recursive relationship j may range over period 1, 2, 3, or 4, resulting
in the costs C_1^(1), C_1^(2), C_1^(3), C_1^(4). The cost C_1 is then the minimum of C_1^(1), C_1^(2),
C_1^(3), and C_1^(4).

Figure 18.6 Alternative production schedules when production is required at the beginning of period 3.


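\[ C_1^{(1)} = C_2 + 2 + 1(3) = 10.4 + 2 + 3 = 15.4, \]
\[ C_1^{(2)} = C_3 + 2 + 1(3 + 2) + \tfrac{1}{5}(2) = 7.4 + 2 + 5 + 0.4 = 14.8, \]
\[ C_1^{(3)} = C_4 + 2 + 1(3 + 2 + 3) + \tfrac{1}{5}[2 + 2(3)] = 4 + 2 + 8 + 1.6 = 15.6, \]
and
\[ C_1^{(4)} = C_5 + 2 + 1(3 + 2 + 3 + 2) + \tfrac{1}{5}[2 + 2(3) + 3(2)] = 0 + 2 + 10 + 2.8 = 14.8. \]
Hence, C_1 = min(15.4, 14.8, 15.6, 14.8) = 14.8,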


so that the optimal production schedule is to produce all the speakers at the beginning 
of the first period or to produce 50,000 speakers at the beginning of the first period 
and 50,000 speakers at the beginning of the third period (the same solution as obtained 
previously). Note that the 14.8 comes from C_1^(2) and C_1^(4). The policy associated with
C_1^(4) calls for producing items for periods 1, 2, 3, 4 at period 1 whereas C_1^(2) calls for
producing items for periods 1 and 2 at period 1, and then following an optimal policy
from period 3 on. Such a policy has already been shown to call for production for 
periods 3 and 4 together. 

Again it should be noted that the unit cost c is irrelevant to the problem because 
over all the time periods, all policies use the same number of items at the same total 
cost. Hence this cost could have been neglected, and the same optimal policies would 
have been obtained. 
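
The recursion for C_i used in this example is short enough to sketch directly. The following is an illustrative implementation under stated assumptions: all requirements are positive, costs are in whole dollars per unit of 10,000 speakers, ties are broken in favor of the smaller j, and the function names are hypothetical.

```python
def zero_inventory_recursion(r, K, c, h):
    """Backward recursion for C_i, producing only when the inventory is zero.

    r: requirements r_1..r_n (all assumed positive), K: setup cost,
    c: unit production cost, h: holding cost per unit per period.
    Returns (C, last) with 1-based indexing: C[i] is the minimum cost from
    period i onward and last[i] is the last period covered by the lot made in i.
    """
    n = len(r)
    C = [0] * (n + 2)                  # C[n + 1] = 0
    last = [0] * (n + 2)
    for i in range(n, 0, -1):
        best = None
        for j in range(i, n + 1):
            produced = sum(r[i - 1:j])                              # r_i + ... + r_j
            holding = h * sum((k - i) * r[k - 1] for k in range(i, j + 1))
            cost = C[j + 1] + K + c * produced + holding
            if best is None or cost < best:
                best, last[i] = cost, j
        C[i] = best
    return C, last

def schedule(r, last):
    """Translate the `last` table into (period, quantity) production lots."""
    lots, i = [], 1
    while i <= len(r):
        j = last[i]
        lots.append((i, sum(r[i - 1:j])))
        i = j + 1
    return lots

# Speaker example in dollars, quantities in units of 10,000 speakers.
C, last = zero_inventory_recursion(r=(3, 2, 3, 2), K=20_000, c=10_000, h=2_000)
lots = schedule((3, 2, 3, 2), last)
print(C[1:5], lots)
```

Because ties go to the smaller j, the sketch reports the schedule that produces in periods 1 and 3; the alternative of producing everything in period 1 has the same cost.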

The characterization of the optimal policy, and the subsequent algorithm for 
finding an optimal policy, depended upon the assumption that the holding and pro- 
duction costs were linear. This constraint can be relaxed to include concave production 
and holding costs. In fact, any increasing function of the holding cost will serve as 
an alternative condition if the production cost is linear. However, these alternative 
conditions require a modification in the algorithm for finding an optimal policy. If the 
production cost function is denoted by c[·] and the holding cost function by h[·], the recursive
relationship for C_i becomes

\[
C_i = \min_{j = i, i+1, \ldots, n}
\bigl\{ C_{j+1} + K + c[r_i + r_{i+1} + \cdots + r_j]
+ h[r_{i+1} + r_{i+2} + \cdots + r_j]
+ h[r_{i+2} + r_{i+3} + \cdots + r_j] + \cdots + h[r_j] \bigr\},
\]

where C_{n+1} = 0.¹
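
As a sketch of how this more general recursion might be coded, the version below accepts the production and holding cost functions as arbitrary callables. The function name and argument names are illustrative assumptions. With linear functions it should reproduce the value found in the speaker example, which gives a simple consistency check.

```python
def general_recursion(r, K, prod_cost, hold_cost):
    """Recursion for C_i with production cost c[.] and holding cost h[.].

    r: requirements r_1..r_n, K: setup cost,
    prod_cost, hold_cost: callables taking a quantity and returning a cost.
    Returns C with 1-based indexing and C[n + 1] = 0.
    """
    n = len(r)
    C = [0.0] * (n + 2)
    for i in range(n, 0, -1):
        candidates = []
        for j in range(i, n + 1):
            production = prod_cost(sum(r[i - 1:j]))       # c[r_i + ... + r_j]
            # Stock left at the end of period k (k = i, ..., j-1) is r_{k+1} + ... + r_j,
            # so there is no holding term when j = i.
            holding = sum(hold_cost(sum(r[k:j])) for k in range(i, j))
            candidates.append(C[j + 1] + K + production + holding)
        C[i] = min(candidates)
    return C

# Linear costs of the speaker example, in units of $10,000.
C_lin = general_recursion((3, 2, 3, 2), K=2.0,
                          prod_cost=lambda q: 1.0 * q,
                          hold_cost=lambda q: 0.2 * q)
print(round(C_lin[1], 1))
```

Any concave production cost or increasing holding cost function can be passed in the same way, subject to the conditions stated above.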

A natural extension of the production planning model permits shortages to occur. 
This extension has been studied by Zangwill² and differs from the production planning
model studied in that shortage costs are incurred for each unit of demand unfilled for 
one unit of time. Zangwill characterizes the form of the optimal policy and gives an 
efficient recursive relationship for finding the optimal policy. These results apply when 
the production, holding, and shortage costs, per period, are concave functions (thereby 
including the case of linear costs). 


1 In the expression for C_i there is no holding cost when j = i.

2 Zangwill, W. I.: “A Deterministic Multi-Period Production Scheduling Model with Backlogging,” Management
Science, 13(1):105-119, 1966.


Other results have been obtained for this production planning model under the
assumption that the production, holding, and shortage costs are convex. Convex pro- 
duction costs arise, for example, when there are several sources of limited production 
at different unit costs in a period. If one uses these sources up to capacity in order of 
ascending unit cost (as is optimal), the resulting production cost is convex in the total 
amount produced. Such an assumption about the production cost precludes the use of 
the setup charge. 
Periodic Review—Production Planning:
Integer Programming Formulation


The final technique for solving the production planning model is to formulate it as an 
integer programming problem. Instead of presenting the general solution, we shall 
again solve the speaker production example. 

Again, let z_i denote the quantity produced at the beginning of period i.¹ The
costs to be considered are

\[ \text{Production costs} = \text{setup costs} + c(z_1 + z_2 + z_3 + z_4) \]

and

\[
\begin{aligned}
\text{Holding costs} &= h(z_1 - r_1) + h(z_1 + z_2 - r_1 - r_2) + h(z_1 + z_2 + z_3 - r_1 - r_2 - r_3) \\
&= 3h(z_1 - r_1) + 2h(z_2 - r_2) + h(z_3 - r_3) \\
&= 0.20[3(z_1 - 3) + 2(z_2 - 2) + (z_3 - 3)].
\end{aligned}
\]


Hence the problem to be solved is to minimize the sum of the production and holding 
costs; that is, 


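\[
\text{Minimize } W = \text{setup cost} + 1(z_1 + z_2 + z_3 + z_4)
+ 0.20[3(z_1 - 3) + 2(z_2 - 2) + (z_3 - 3)],
\]

subject to

\[
\begin{aligned}
z_1 &\le r_1 + r_2 + r_3 + r_4 = 10 \\
z_2 &\le r_2 + r_3 + r_4 = 7 \\
z_3 &\le r_3 + r_4 = 5 \\
z_4 &\le r_4 = 2 \\
z_1 &\ge r_1 = 3 \\
z_1 + z_2 &\ge r_1 + r_2 = 5 \\
z_1 + z_2 + z_3 &\ge r_1 + r_2 + r_3 = 8 \\
z_1 + z_2 + z_3 + z_4 &= r_1 + r_2 + r_3 + r_4 = 10
\end{aligned}
\]

and z_1 ≥ 0, z_2 ≥ 0, z_3 ≥ 0, z_4 ≥ 0, and integer valued.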


Unfortunately, this model is not an integer linear programming model because
the objective function is not linear as a result of the “setup costs” term that appears.

1 Again, all values are expressed in units of 10,000 speakers.

Define variables v_1, v_2, v_3, and v_4 such that

\[ v_1 \le 1, \quad v_2 \le 1, \quad v_3 \le 1, \quad v_4 \le 1, \]
\[ v_1 \ge 0, \quad v_2 \ge 0, \quad v_3 \ge 0, \quad v_4 \ge 0, \]

and v_1, v_2, v_3, and v_4 are integers (v_i is then 0 or 1).


Add the following constraints to the problem: 


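\[
\begin{aligned}
z_1 &\le \Bigl(\sum_{i=1}^{4} r_i\Bigr) v_1 = 10v_1, \\
z_2 &\le \Bigl(\sum_{i=2}^{4} r_i\Bigr) v_2 = 7v_2, \\
z_3 &\le \Bigl(\sum_{i=3}^{4} r_i\Bigr) v_3 = 5v_3, \\
z_4 &\le r_4 v_4 = 2v_4.
\end{aligned}
\]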


Thus if v_i = 0, then z_i = 0, and if v_i = 1, then z_i is less than or equal to the
maximum production in period i. The objective function can now be written as a
linear function subject to linear constraints; that is, 


\[
\text{Minimize } W = K(v_1 + v_2 + v_3 + v_4) + 1(z_1 + z_2 + z_3 + z_4)
+ 0.20[3(z_1 - 3) + 2(z_2 - 2) + (z_3 - 3)],
\]

subject to

\[
\begin{aligned}
z_1 &\le 10v_1 \\
z_2 &\le 7v_2 \\
z_3 &\le 5v_3 \\
z_4 &\le 2v_4 \\
z_1 &\ge 3 \\
z_1 + z_2 &\ge 5 \\
z_1 + z_2 + z_3 &\ge 8 \\
z_1 + z_2 + z_3 + z_4 &= 10
\end{aligned}
\]

z_1 ≥ 0, z_2 ≥ 0, z_3 ≥ 0, z_4 ≥ 0, and integer valued,
and v_1 ≥ 0, v_2 ≥ 0, v_3 ≥ 0, v_4 ≥ 0,
v_1 ≤ 1, v_2 ≤ 1, v_3 ≤ 1, v_4 ≤ 1, and integer valued.


The solution to this problem may be obtained by using an integer programming
algorithm; it yields the same solution as obtained previously, i.e., produce 100,000
speakers at the beginning of the first period (z_1 = 10, v_1 = 1, and all other z_i and v_i
equal zero), or produce 50,000 speakers at the beginning of the first period and 50,000
speakers at the beginning of the third period (z_1 = z_3 = 5, v_1 = v_3 = 1, and all
other z_i and v_i equal zero).
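
Because this instance is so small, the formulation can be checked by plain enumeration rather than by an integer programming algorithm. The sketch below is only such a check, not a substitute for a solver. It assumes costs expressed in thousands of dollars per unit of 10,000 speakers (so K = 20, c = 10, h = 2), which keeps the arithmetic in integers; the helper names are illustrative.

```python
from itertools import product

K, c, h = 20, 10, 2                 # thousands of dollars per unit of 10,000 speakers
r = (3, 2, 3, 2)
caps = tuple(sum(r[i:]) for i in range(4))     # upper bounds (10, 7, 5, 2)

def feasible(z, v):
    """Constraints of the integer linear program for the speaker example."""
    return (all(z[i] <= caps[i] * v[i] for i in range(4))
            and z[0] >= 3
            and z[0] + z[1] >= 5
            and z[0] + z[1] + z[2] >= 8
            and sum(z) == 10)

def W(z, v):
    """Objective: setup cost + production cost + holding cost."""
    return (K * sum(v) + c * sum(z)
            + h * (3 * (z[0] - 3) + 2 * (z[1] - 2) + (z[2] - 3)))

# Enumerate every integer z within its bounds and every 0-1 vector v.
solutions = [(z, v)
             for z in product(*(range(cap + 1) for cap in caps))
             for v in product((0, 1), repeat=4)
             if feasible(z, v)]
best_W = min(W(z, v) for z, v in solutions)
optimal = [(z, v) for z, v in solutions if W(z, v) == best_W]
print(best_W, optimal)
```

Enumeration of this kind grows exponentially with the number of periods, which is why an integer programming algorithm is needed for larger instances.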
18.4 Stochastic Models


This section is concerned with inventory problems where the demand for a period is 
a random variable having a known probability distribution. Both single-period and 
multiperiod models are analyzed. 


A Single-Period Model with No Setup Cost 


The second example discussed in Sec. 18.1 is concerned with a wholesale distributor 
of 10-speed bicycles. Suppose that this distributor is offered very favorable terms on 
the purchase of a model of a name-brand bicycle whose production is to be discon- 
tinued. This opportunity appears to be ideal for the forthcoming Christmas season, 
where, because production has been discontinued, the stores have been informed that 
no reorders are possible. The cost of each bicycle is $20, and it will be assumed that 
there is no setup cost incurred. The cost of maintaining an inventory is −$9 per
bicycle. This cost includes $1, which represents the cost of capital tied up, warehouse
space, and so on, and −$10, which is what the distributor can get for each bicycle
remaining in the inventory after the Christmas season (the salvage value). Note that 
this cost of maintaining an inventory is obtained by combining the bicycle storage 
cost with the bicycle salvage value and results in a negative holding cost. Each bicycle 
is sold for $45 for a profit of $25. 

Two remaining cost components still require discussion, i.e., the unsatisfied 
demand cost and the revenue. If the demand exceeds the supply, those customers who 
fail to purchase a bicycle may bear some ill will, thereby resulting in a ‘‘cost’’ to the 
distributor. This unsatisfied demand cost is simply the per-item quantification of the
loss of goodwill times the unsatisfied demand whenever a shortage occurs. In the 
bicycle example, this cost is considered to be negligible. 

If we adopt the criterion of maximizing net income, we must include revenue 
in the model. Indeed, net income is equal to total revenue minus the costs incurred 
(ordering, inventory holding, and unsatisfied demand). The total revenue component 
is simply the sales price of a bicycle ($45) times the demand minus the sales price 
times the unsatisfied demand whenever a shortage occurs. The former is independent 
of the inventory policy, and, hence, can be neglected, whereas the latter is just the 
lost revenue when a shortage exists. This latter term behaves just like the aforemen- 
tioned description of the unsatisfied demand cost, and, hence, the two (the loss of 
goodwill and the lost revenue) can be combined (added) and considered to be the 
resultant unsatisfied demand cost. The unsatisfied demand cost will be so interpreted 
throughout this chapter. Thus, in the bicycle example, the unsatisfied demand cost is simply
$45¹ times the unsatisfied demand whenever a shortage exists.

Because the revenue times the demand term is independent of the inventory 
policy chosen, its omission from the net income expression results in terms that are 


1 In general, the cost of the loss of goodwill, assumed to be negligible in the bicycle example, must be 
added to the revenue to determine the complete unsatisfied demand cost. 
the negative of total cost. Hence maximizing net income in this model is equivalent
to minimizing total cost. 

The discussion about the interpretation of the unsatisfied demand cost assumed 
that the unsatisfied demand was lost. For the case where this unsatisfied demand is 
met by a priority shipment, the same principles prevail. The total revenue component 
becomes the sales price of a bicycle ($45) times the demand minus the unit cost of 
the priority shipment times the unsatisfied demand whenever a shortage occurs. If the 
West Coast distributor is forced to meet the unsatisfied demand by purchasing bicycles 
from the Midwest distributor at the same cost that bicycles are sold to the retail outlets 
($45) plus an air freight charge of, say, $2, then the appropriate unsatisfied demand 
cost is $47 per bicycle. Of course, any costs associated with loss of goodwill, if 
present, would be added to this amount. 

The previous discussion concerned the costs involved in the model, and little 
attention was paid to the concept of the demand for the bicycles. Unfortunately, the 
distributor does not know what the demand for these bicycles will be from the stores 
in the western United States; i.e., the demand is a random variable, and hence he 
does not know how many bicycles will be required. However, an optimal inventory 
policy can be obtained if information about the probability distribution of demand is 
available. Let D represent the random variable demand, and denote by P_D(d) the
probability that the demand equals d; that is,

\[ P_D(d) = P\{D = d\}. \]

It will be assumed that P_D(d) is known for all values of d; that is, the probability
distribution is specified. 

In general, we consider the following inventory model. Items are purchased (or 
produced) for a single period at a cost of c dollars per item. The holding cost, the 
net unit cost of storing leftover items minus their salvage value, is given by h dollars 
per item and charged as a function of excess stock over the amount required. The 
cost of unsatisfied demand, e.g., lost revenue from sales or the cost of supplying a 
required unit, is given by p dollars per unit (p > c). It is assumed that there is no 
initial inventory on hand. Denote by y the quantity purchased (or produced) at the 
beginning of the period,¹ and let D be a random variable that denotes the demand 
during the period. 

This single-period model may represent the inventory of an item that (1) becomes 
obsolete quickly, such as the bicycle in the example or a daily newspaper; (2) spoils 
quickly, such as vegetables; (3) is stocked only once, such as spare parts for a single 
production run of a new model airplane; or (4) has a future that is uncertain beyond 
a single period. 

With the aforementioned structure, the question of how much inventory to have 
becomes relevant. More than the expected demand is probably desirable, but certainly 
less than the maximum demand is required. A trade-off is needed between (1) the 
risk of being short and thereby incurring shortage costs and (2) the risk of having an 
excess and thereby incurring wasted costs of ordering and holding excess units. One 
reasonable criterion is to choose the inventory level that minimizes the expected value 
(in the statistical sense) of the sum of these costs. 

1 In the previous models the symbol z was used to denote the quantity produced, and x represented the 
inventory level at the beginning of the period. The symbol y is introduced here to denote the inventory 
level to be achieved after the ordering decision is made; that is, y is the amount ordered up to, so that 
y = x + z. Because the initial inventory level is assumed to be zero (that is, x = 0), y and z are the 
same. 

The amount sold is given by 

\[ \min(D, y) = \begin{cases} D, & \text{if } D < y \\ y, & \text{if } D \ge y. \end{cases} \]

Hence the cost incurred if the demand is D and y is stocked is given by 

\[ C(D, y) = cy + p \max(0, D - y) + h \max(0, y - D). \]


Because the demand is a random variable [with probability distribution $P_D(d)$], this 
cost is also a random variable. The expected cost is then given by C(y), where 

\[
\begin{aligned}
C(y) = E[C(D, y)] &= \sum_{d=0}^{\infty} [cy + p \max(0, d - y) + h \max(0, y - d)] P_D(d) \\
&= cy + \sum_{d=y}^{\infty} p(d - y) P_D(d) + \sum_{d=0}^{y-1} h(y - d) P_D(d).
\end{aligned}
\]

It is evident that C(y) depends upon the probability distribution $P_D(d)$. Frequently a 
representation of this probability distribution is difficult to find, particularly when the 
demand ranges over a large number of possible values. Hence this discrete random 
variable is often approximated by a continuous random variable. Furthermore, when 
demand ranges over a large number of possible values, this approximation will generally 
yield small differences in numerical values in the optimal amounts of inventory 
to stock. In addition, when discrete demand is used the resulting expressions may 
become slightly more difficult to solve analytically. Hence, unless otherwise stated, 
continuous demand is assumed throughout the remainder of this chapter. The probability 
density function of this continuous random variable will be denoted by $\varphi_D(\xi)$. 
The expected cost C(y) is then expressed as¹ 

\[
\begin{aligned}
C(y) = E[C(D, y)] &= \int_0^\infty [cy + p \max(0, \xi - y) + h \max(0, y - \xi)] \varphi_D(\xi)\, d\xi \\
&= cy + \int_y^\infty p(\xi - y)\varphi_D(\xi)\, d\xi + \int_0^y h(y - \xi)\varphi_D(\xi)\, d\xi \\
&= cy + L(y),
\end{aligned}
\]


where L(y) is often called the expected shortage plus holding cost. It then becomes 
necessary to find the value of y, say $y^0$, which minimizes C(y). The optimal quantity 
to order, $y^0$, is that value which satisfies 

\[ \Phi(y^0) = \frac{p - c}{p + h}, \]

where the function $\Phi(a)$ is the cumulative distribution function of the demand random 
variable; that is, 

\[ \Phi(a) = \int_0^a \varphi_D(\xi)\, d\xi. \]

The derivation of this solution is given on page 712. 

1 The standard notation of probability theory is being followed. If X is a random variable having density 
function $f_X(y)$, and g(X) is a function of X, then 

\[ E[g(X)] = \int_{-\infty}^{\infty} g(y) f_X(y)\, dy. \]

Thus the expected cost is given by C(y). 

If D is assumed to be a discrete random variable having the cumulative 
distribution function 

\[ F_D(b) = \sum_{d=0}^{b} P_D(d), \]

a similar result for the optimal order quantity is obtained. In particular, the optimal 
quantity to order, $y^0$, is the smallest integer such that 

\[ F_D(y^0) \ge \frac{p - c}{p + h}. \]
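The discrete rule above can be checked numerically. The following is a minimal sketch, not part of the original text: the demand probabilities in `pmf` are hypothetical, the function names are illustrative, and the costs are borrowed from the bicycle example. It finds the smallest integer y whose cumulative probability reaches the ratio (p - c)/(p + h) and compares it with a brute-force minimization of the expected cost C(y).

```python
from itertools import accumulate

def expected_cost(y, pmf, c, p, h):
    """C(y) = c*y + sum over d of [p*max(0, d-y) + h*max(0, y-d)] * P_D(d)."""
    return c * y + sum(
        (p * max(0, d - y) + h * max(0, y - d)) * prob
        for d, prob in enumerate(pmf)
    )

def optimal_order_quantity(pmf, c, p, h):
    """Smallest integer y with F_D(y) >= (p - c) / (p + h)."""
    ratio = (p - c) / (p + h)
    for y, cdf in enumerate(accumulate(pmf)):
        if cdf >= ratio:
            return y
    return len(pmf) - 1  # fallback if round-off keeps the CDF just below 1

# Hypothetical demand distribution on d = 0, 1, ..., 5 (illustrative only).
pmf = [0.05, 0.15, 0.30, 0.25, 0.15, 0.10]
c, p, h = 20, 45, -9  # unit costs taken from the bicycle example

y0 = optimal_order_quantity(pmf, c, p, h)
brute_force = min(range(len(pmf)), key=lambda y: expected_cost(y, pmf, c, p, h))
print(y0, brute_force)
```

For this assumed distribution the ratio is 25/36, and both approaches select the same order quantity.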


EXAMPLE: Returning to the bicycle example described in this section, assume that 
the demand has an exponential distribution given by 

\[ \varphi_D(\xi) = \begin{cases} \dfrac{1}{10{,}000}\, e^{-\xi/10{,}000}, & \text{if } \xi \ge 0 \\ 0, & \text{otherwise.} \end{cases} \]


From the data given, 

\[ c = 20, \qquad p = 45, \qquad h = -9. \]


Because the demand density is exponential, 

\[ \Phi(a) = \int_0^a \frac{1}{10{,}000}\, e^{-\xi/10{,}000}\, d\xi = 1 - e^{-a/10{,}000}. \]

The optimum quantity to order, $y^0$, is that value which satisfies 

\[ 1 - e^{-y^0/10{,}000} = \frac{45 - 20}{45 - 9} = 0.6944, \]

so that 

\[ y^0 = 11{,}856. \]

Therefore, the distributor should stock 11,856 bicycles in the Christmas season. Note 
that this number is slightly more than the expected demand. 
Whenever the demand is exponential with expectation $\lambda$, $y^0$ can easily be 
obtained from the relation 

\[ y^0 = -\lambda \ln\left(\frac{c + h}{p + h}\right). \]
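As a quick numerical check of this relation, the sketch below evaluates it for the bicycle data (mean demand 10,000, c = 20, p = 45, h = -9). The function name is an illustrative choice, not something defined in the text.

```python
import math

def exponential_order_quantity(lam, c, p, h):
    """y0 = -lam * ln((c + h) / (p + h)) for exponential demand with mean lam."""
    return -lam * math.log((c + h) / (p + h))

y0 = exponential_order_quantity(10_000, 20, 45, -9)

# The cumulative distribution function evaluated at y0 should equal (p - c)/(p + h).
critical_ratio = 1 - math.exp(-y0 / 10_000)
print(round(y0), round(critical_ratio, 4))
```

The computed quantity rounds to the 11,856 bicycles found in the example.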


MODEL WITH INITIAL STOCK LEVEL: As a slight variation of the previous model, 
suppose that the distributor has 500 bicycles of the aforementioned type on hand. 
How does this stock influence the optimal inventory policy? In particular, suppose 
that the initial stock level is given by x, and the problem is to determine how much 
is to be made available, y, at the beginning of the period. Thus $(y - x)$ is to be 
ordered so that 

    Amount available (y) = initial stock (x) + amount ordered (y - x). 


The cost equation presented earlier remains identical except for the term that 
was previously cy. This term now becomes $c(y - x)$, so that the problem of minimizing 
the expected cost is given by 

\[ \min_{y \ge x} \left[ c(y - x) + \int_y^\infty p(\xi - y)\varphi_D(\xi)\, d\xi + \int_0^y h(y - \xi)\varphi_D(\xi)\, d\xi \right]. \]


The constraint $y \ge x$ must be added because it is assumed that items on hand 
at the beginning of the period cannot be depleted or returned. The optimum policy is 
described as follows. 

The inventory policy which satisfies, for p > c, 

\[ \min_{y \ge x} \left[ -cx + \left\{ \int_y^\infty p(\xi - y)\varphi_D(\xi)\, d\xi + \int_0^y h(y - \xi)\varphi_D(\xi)\, d\xi + cy \right\} \right] \]

is given by 

    If $x < y^0$, order up to $y^0$ (order $y^0 - x$), 
    If $x \ge y^0$, do not order, 

where $y^0$ satisfies 

\[ \Phi(y^0) = \frac{p - c}{p + h}. \]


Thus, in the bicycle example, if there are 500 bicycles on hand, the optimal 
policy is to order up to 11,856 bicycles (which implies ordering 11,356 additional 
bicycles). On the other hand, if there are 12,000 bicycles already on hand, the optimal 
policy is not to order. 
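The single-critical-number policy amounts to a one-line rule. A minimal sketch follows; the function name `order_amount` is an assumption made for illustration, and the two calls reproduce the bicycle cases discussed above.

```python
def order_amount(x, y0):
    """Order up to y0 when the stock on hand x is below y0; otherwise order nothing."""
    return max(0, y0 - x)

y0 = 11_856  # critical number from the bicycle example

with_500_on_hand = order_amount(500, y0)        # stock below y0: order the difference
with_12000_on_hand = order_amount(12_000, y0)   # stock above y0: do not order
print(with_500_on_hand, with_12000_on_hand)
```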


MODEL WITH NONLINEAR PENALTY COSTS: Similar results for these models can 
be obtained for other than linear holding and shortage penalty costs. Denote the 
holding cost by 

\[ \begin{cases} h[y - D], & \text{if } y \ge D \\ 0, & \text{if } y < D, \end{cases} \]

where $h[\cdot]$ is a mathematical function, not necessarily linear. 
Similarly, the shortage penalty cost can be denoted by 

\[ \begin{cases} p[D - y], & \text{if } D \ge y \\ 0, & \text{if } D < y, \end{cases} \]

where $p[\cdot]$ is also a function, not necessarily linear. 
Thus the total expected cost is given by 

\[ c(y - x) + \int_y^\infty p[\xi - y]\varphi_D(\xi)\, d\xi + \int_0^y h[y - \xi]\varphi_D(\xi)\, d\xi, \]


where x is the amount on hand. 
If L(y) is defined as the expected shortage plus holding cost, that is, 

\[ L(y) = \int_y^\infty p[\xi - y]\varphi_D(\xi)\, d\xi + \int_0^y h[y - \xi]\varphi_D(\xi)\, d\xi, \]

then the total expected cost can be written as 

\[ c(y - x) + L(y). \]

The optimal policy is obtained by minimizing this expression, subject to the 
constraint that $y \ge x$; that is, 

\[ \min_{y \ge x} \left[ c(y - x) + L(y) \right]. \]

If L(y) is strictly convex¹ [a sufficient condition being that the shortage and 
holding costs each are convex and $\varphi_D(\xi) > 0$], then the optimal policy is given by 

    if $x < y^0$, order up to $y^0$ 
    if $x \ge y^0$, do not order, 

where $y^0$ is the value of y that satisfies the expression 

\[ \frac{dL(y)}{dy} + c = 0. \]

DERIVATION OF RESULTS FOR THE SINGLE-PERIOD MODEL WITH NO SETUP 
COST AND LINEAR SHORTAGE AND HOLDING COSTS²: A useful result for finding 
policies that minimize the expected cost is as follows. 
Let D be a random variable having a density function 

\[ \begin{cases} \varphi_D(\xi), & \text{if } \xi \ge 0 \\ 0, & \text{otherwise.} \end{cases} \]

Denote by $\Phi(a)$ the cumulative distribution function; that is, 

\[ \Phi(a) = \int_0^a \varphi_D(\xi)\, d\xi. \]

Let $g(\xi, y)$ be defined as 

\[ g(\xi, y) = \begin{cases} c_1 (y - \xi), & \text{if } y > \xi, \; c_1 > 0 \\ c_2 (\xi - y), & \text{if } y \le \xi, \; c_2 > 0, \end{cases} \]

and 

\[ G(y) = \int_0^\infty g(\xi, y)\varphi_D(\xi)\, d\xi + cy, \]

where c > 0. Then G(y) is minimized at $y = y^0$, where $y^0$ is the solution to 

\[ \Phi(y^0) = \frac{c_2 - c}{c_2 + c_1}. \]

To see why this value of $y^0$ minimizes G(y), note that, by definition, 

\[ G(y) = c_1 \int_0^y (y - \xi)\varphi_D(\xi)\, d\xi + c_2 \int_y^\infty (\xi - y)\varphi_D(\xi)\, d\xi + cy. \]


1 See Appendix 1 for the definition of a convex function. 
? This section may be omitted by the less mathematically inclined reader. 


Taking the derivative (see Appendix 2) and setting it equal to zero leads to 

\[ \frac{dG(y)}{dy} = c_1 \int_0^y \varphi_D(\xi)\, d\xi - c_2 \int_y^\infty \varphi_D(\xi)\, d\xi + c = 0. \]

This expression implies that 

\[ c_1 \Phi(y^0) - c_2 [1 - \Phi(y^0)] + c = 0, \]

because 

\[ \int_0^\infty \varphi_D(\xi)\, d\xi = 1. \]

Solving this expression results in 

\[ \Phi(y^0) = \frac{c_2 - c}{c_2 + c_1}. \]

Checking the second derivative indicates that 

\[ \frac{d^2 G(y)}{dy^2} = (c_1 + c_2)\varphi_D(y) \ge 0 \]

for all y, so that the result is obtained. 
To apply this result, it is sufficient to show that 

\[ C(y) = cy + \int_y^\infty p(\xi - y)\varphi_D(\xi)\, d\xi + \int_0^y h(y - \xi)\varphi_D(\xi)\, d\xi \]

has the form of G(y). 
Clearly, $c_1 = h$, $c_2 = p$, and $c = c$, so that the optimal quantity to order, $y^0$, 
is that value which satisfies 

\[ \Phi(y^0) = \frac{p - c}{p + h}. \]


To derive the results for the case where the initial stock level is x, recall that it 
is necessary to solve the relationship 

\[ \min_{y \ge x} \left[ -cx + \left\{ \int_y^\infty p(\xi - y)\varphi_D(\xi)\, d\xi + \int_0^y h(y - \xi)\varphi_D(\xi)\, d\xi + cy \right\} \right]. \]

Note that the expression in braces has the form of G(y), with $c_1 = h$, $c_2 = p$, and 
$c = c$. Hence the cost function to be minimized can be written as 

\[ \min_{y \ge x} \left[ -cx + G(y) \right]. \]

It is clear that $-cx$ is a constant, so that it is sufficient to find the y that satisfies the 
expression 

\[ \min_{y \ge x} G(y). \]

Hence the value of $y^0$ that minimizes G(y) satisfies 

\[ \Phi(y^0) = \frac{p - c}{p + h}. \]

[Figure 18.7 Graph of G(y), with the minimum value $G(y^0)$ marked.] 



Furthermore, G(y) must be a convex function, because 

\[ \frac{d^2 G(y)}{dy^2} \ge 0. \]

Also, 

\[ \lim_{y \to 0} \frac{dG(y)}{dy} = c - p, \]

which is negative,¹ and 

\[ \lim_{y \to \infty} \frac{dG(y)}{dy} = h + c, \]

which is positive. Hence G(y) must be as shown in Fig. 18.7. Thus the optimal policy 
must be given by the following: 

    If $x < y^0$, order up to $y^0$ because $y^0$ can be achieved together with the minimum 
    value $G(y^0)$, 
    If $x \ge y^0$, do not order because any G(y), with y > x, must exceed G(x). 


A similar argument can be constructed for obtaining optimal policies with non- 
linear penalty costs when L(y) is strictly convex. 


A Single-Period Model with a Setup Cost 


In discussing the bicycle example in this section, it was assumed that there was no 
fixed cost incurred in ordering the bicycles for the Christmas season. In actual fact, 
the cost of placing this special order is $800, and this cost should be included in the 
analysis of the model. In fact, inclusion of the setup cost generally causes major 
changes in the results. 

In general, the setup cost will be denoted by K. To begin with, the shortage 
and holding costs will each be assumed to be linear. Their resultant effect is then 
given by L(y), where 

\[ L(y) = p \int_y^\infty (\xi - y)\varphi_D(\xi)\, d\xi + h \int_0^y (y - \xi)\varphi_D(\xi)\, d\xi. \]

1 If c - p is nonnegative, G(y) will be a monotone increasing function. This implies that the item should 
not be stocked; that is, $y^0 = 0$. 

[Figure 18.8 Graph of cy + L(y), with the points s and S marked on the y axis.] 

Thus the total expected cost incurred if one orders up to y is given by 

\[ \begin{cases} K + c(y - x) + L(y), & \text{if } y > x \\ L(x), & \text{if } y = x. \end{cases} \]


Note that cy + L(y) is the same expected cost considered in the previous section 
when the setup cost was omitted. If cy + L(y) is drawn as a function of y, it will 
appear as shown in Fig. 18.8.¹ Define S as the value of y that minimizes cy + L(y), and 
define s as the smallest value of y for which cs + L(s) = K + cS + L(S). From Fig. 
18.8 it is evident that if x > S, then K + cy + L(y) > cx + L(x), for all y > x. 
Hence K + c(y - x) + L(y) > L(x), where the left-hand side of the inequality 
represents the expected total cost if one orders up to y, and the right-hand side of the 
inequality represents the expected total cost if no ordering occurs. Hence the optimum 
policy indicates that if x > S, do not order. If $s \le x \le S$, it is again evident from 
Fig. 18.8 that 

\[ K + cy + L(y) \ge cx + L(x), \quad \text{for all } y > x, \]

so that 

\[ K + c(y - x) + L(y) \ge L(x). \]

Again, no ordering is less expensive than ordering. Finally, if x < s, it follows from 
Fig. 18.8 that 

\[ \min_{y \ge x} \left[ K + cy + L(y) \right] = K + cS + L(S) < cx + L(x), \]

or 

\[ \min_{y \ge x} \left[ K + c(y - x) + L(y) \right] = K + c(S - x) + L(S) < L(x), \]

so that it pays to order. The minimum cost is incurred if one orders up to S. 
Thus the optimum ordering policy can be summarized as follows: 

    if $x < s$, order up to S 
    if $x \ge s$, do not order. 


The value of S is obtained from 

\[ \Phi(S) = \frac{p - c}{p + h}, \]

and s is the smallest value that satisfies the expression 

\[ cs + L(s) = K + cS + L(S). \]

This policy is an (s, S) policy mentioned previously, and it has had extensive use in 
industry. 

1 In the derivation of results for the single-period model with no setup cost and linear shortage and holding 
costs, cy + L(y) was denoted by G(y) and rigorously shown to be a convex function of the form plotted 
in Fig. 18.8. 

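EXAMPLE: Referring to the bicycle example of the previous section, 

\[ y^0 = S = 11{,}856. \]

If K = 800, c = 20, p = 45, and h = -9, s is obtained from 

\[
\begin{aligned}
20s &+ 45 \int_s^\infty (\xi - s) \frac{1}{10{,}000}\, e^{-\xi/10{,}000}\, d\xi - 9 \int_0^s (s - \xi) \frac{1}{10{,}000}\, e^{-\xi/10{,}000}\, d\xi \\
&= 800 + 20(11{,}856) + 45 \int_{11{,}856}^\infty (\xi - 11{,}856) \frac{1}{10{,}000}\, e^{-\xi/10{,}000}\, d\xi \\
&\quad - 9 \int_0^{11{,}856} (11{,}856 - \xi) \frac{1}{10{,}000}\, e^{-\xi/10{,}000}\, d\xi,
\end{aligned}
\]

so that s = 10,674. 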


Hence the optimal policy calls for ordering up to S = 11,856 bicycles if the amount 
on hand is less than s = 10,674. Otherwise, no order is placed. 
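The equation for s has no closed-form solution, but it is straightforward to solve numerically. The sketch below is one hedged way to do so: it relies on the closed form of cy + L(y) for exponential demand that is derived later in this section, and it uses bisection on the interval from 0 to S, where the cost curve is decreasing. The function names and the iteration count are illustrative assumptions.

```python
import math

def cost_without_setup(y, lam, c, p, h):
    """c*y + L(y) for exponential demand with mean lam (closed form from this section)."""
    return (c + h) * y + lam * (h + p) * math.exp(-y / lam) - lam * h

def order_up_to_level(lam, c, p, h):
    """S solves 1 - exp(-S/lam) = (p - c)/(p + h)."""
    return lam * math.log((h + p) / (h + c))

def reorder_point(S, K, lam, c, p, h, iterations=200):
    """Smallest s with c*s + L(s) = K + c*S + L(S), found by bisection on [0, S]."""
    target = K + cost_without_setup(S, lam, c, p, h)
    if cost_without_setup(0.0, lam, c, p, h) <= target:
        return 0.0  # ordering never pays, even with no stock on hand
    lo, hi = 0.0, S  # the cost exceeds the target at lo and is below it at hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cost_without_setup(mid, lam, c, p, h) > target:
            lo = mid
        else:
            hi = mid
    return hi

lam, c, p, h, K = 10_000, 20, 45, -9, 800  # bicycle example data
S = order_up_to_level(lam, c, p, h)
s = reorder_point(S, K, lam, c, p, h)
print(round(S), round(s))
```

With the bicycle data the result agrees with the values of S and s quoted in the example, up to rounding.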


SOLUTION WHEN THE DEMAND DISTRIBUTION IS EXPONENTIAL: It may be of 
interest to solve this model, in general, when the distribution of demand, D, is 
exponential; that is, 

\[ \varphi_D(\xi) = \frac{1}{\lambda}\, e^{-\xi/\lambda}, \quad \text{for } \xi \ge 0. \]

If $\Delta$ is defined as S - s, then $\Delta$ is the solution to the equation 

\[ e^{\Delta/\lambda} = \frac{K}{\lambda(c + h)} + \frac{\Delta}{\lambda} + 1. \]

Furthermore, a good approximation for $\Delta$ is given by 

\[ \Delta \approx \sqrt{\frac{2\lambda K}{c + h}}. \]

Note that $s = S - \Delta$. 





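These results are easily obtained. From the no-setup cost results, 

\[ 1 - e^{-S/\lambda} = \frac{p - c}{p + h}, \]

so that 

\[ S = \lambda \ln\left(\frac{h + p}{h + c}\right). \]

For any y, 

\[
\begin{aligned}
cy + L(y) &= cy + h \int_0^y (y - \xi) \frac{1}{\lambda}\, e^{-\xi/\lambda}\, d\xi + p \int_y^\infty (\xi - y) \frac{1}{\lambda}\, e^{-\xi/\lambda}\, d\xi \\
&= (c + h)y + \lambda(h + p)e^{-y/\lambda} - \lambda h.
\end{aligned}
\]

Evaluating cy + L(y) at the points y = s and y = S leads to 

\[ (c + h)s + \lambda(h + p)e^{-s/\lambda} - \lambda h = K + (c + h)S + \lambda(h + p)e^{-S/\lambda} - \lambda h, \]

or 

\[ (c + h)s + \lambda(h + p)e^{-s/\lambda} = K + (c + h)S + \lambda(c + h). \]

Although this last equation does not have a closed-form solution, it can be solved 
numerically quite easily. An approximate analytical solution can be obtained as follows. 
By letting $\Delta = S - s$, the last equation yields 

\[ e^{\Delta/\lambda} = \frac{K}{\lambda(c + h)} + \frac{\Delta}{\lambda} + 1. \]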


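If $\Delta/\lambda$ is close to zero, $e^{\Delta/\lambda}$ can be expanded into a Taylor series around zero. If the 
terms beyond the quadratic term are neglected, the result becomes 

\[ 1 + \frac{\Delta}{\lambda} + \frac{\Delta^2}{2\lambda^2} \approx \frac{K}{\lambda(c + h)} + \frac{\Delta}{\lambda} + 1, \]

so that 

\[ \Delta \approx \sqrt{\frac{2\lambda K}{c + h}}. \]

Using the approximation in the bicycle example results in 

\[ \Delta = \sqrt{\frac{(2)(10{,}000)(800)}{20 - 9}} = 1{,}206, \]

which is quite close to the exact value of $\Delta = 1{,}182$. 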
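The comparison between the exact and approximate values of S - s can be reproduced numerically. The following sketch solves the exponential-demand equation for the gap by bisection and evaluates the square-root approximation next to it; the function names and the bracketing strategy are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def exact_delta(K, lam, c, h, iterations=200):
    """Solve exp(D/lam) = K/(lam*(c+h)) + D/lam + 1 for D > 0 by bisection."""
    constant = K / (lam * (c + h))

    def f(u):  # u stands for D/lam; f is increasing for u > 0 and f(0) < 0 when K > 0
        return math.exp(u) - u - 1.0 - constant

    lo, hi = 0.0, 1.0
    while f(hi) < 0.0:  # widen the bracket until the root is enclosed
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return lam * hi

def approximate_delta(K, lam, c, h):
    """Quadratic Taylor approximation: sqrt(2*lam*K / (c + h))."""
    return math.sqrt(2.0 * lam * K / (c + h))

K, lam, c, h = 800, 10_000, 20, -9  # bicycle example data
delta_exact = exact_delta(K, lam, c, h)
delta_approx = approximate_delta(K, lam, c, h)
print(round(delta_exact), round(delta_approx))
```

The approximation overstates the gap slightly because the neglected cubic and higher-order terms of the exponential series are all positive.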





MODEL WITH NONLINEAR PENALTY COSTS: Again it is evident that these results 
can be extended easily to any strictly convex expected shortage plus holding cost, 
L(y). This extension results in a strictly convex cy + L(y), similar to Fig. 18.8. 
Hence the optimal ordering policy is of the form 

    if $x < s$, order up to S 
    if $x \ge s$, do not order, 

where S is the value of y that satisfies 

\[ \frac{dL(y)}{dy} + c = 0, \]

and s is the smallest value that satisfies the expression 

\[ cs + L(s) = K + cS + L(S). \]


A Two-Period Inventory Model with No Setup Cost 


The single-period model was illustrated with a bicycle example where the distributor 
had only one opportunity to place an order. In many situations an opportunity to place 
an order occurs periodically, e.g., monthly, and the inventory manager must make a 
decision about whether and how much to stock. The manager may be interested in 
these decisions for a horizon of the next 2 months, 12 months, 18 months, or even 
possibly forever. Even for a horizon of 2 months, the periodic review policy of using 
the optimal one-period solution twice is not generally the optimal policy for the 
two-period problem. Smaller costs can usually be achieved by viewing the problem from 
a two-period viewpoint and then using the methods of dynamic programming 
introduced in Chap. 11 to obtain the best inventory policy. In fact, the only difference in 
concept in the dynamic programming approach for solving the inventory problem 
compared to the material in Chap. 11 is the probabilistic aspects that are introduced 
by the random demand. 


TWO-PERIOD MODEL WITH NO SETUP COST: Suppose that the bicycle distributor is 
permitted to make at most one reorder, to occur on November 15, after placing an 
initial order for the special type 10-speed bicycle on October 15. The assumptions 
about the costs are similar to those presented before. Assume that the purchase leads 
to immediate delivery; shortages at the end of the first period are to be made up, if 
they exist (backlogging of orders is possible at the end of the first period, but not at 
the end of the second period); no disposal of stock is permitted. Furthermore, the 
demands $D_1$, $D_2$ for the two periods are independent, identically distributed random 
variables having density $\varphi_D(\xi)$. The purchase cost is linear, that is, cz, where z is the 
amount ordered (no setup cost), and the shortage and holding costs are also linear, 
with respective unit costs denoted by p and h. 

As indicated earlier, the solution to this problem is not to use the optimal 
one-period solution twice. Smaller costs can be achieved by viewing the problem from a 
two-period dynamic programming viewpoint. Order the time periods so that the 
beginning of time period 1 implies that there are two periods left in the horizon. 
Similarly, the beginning of time period 2 implies that there is one period left in the horizon 
(the last period is beginning). The problem is to find critical numbers that describe 
the optimum ordering policy. It will be shown that these numbers are single critical 
numbers for each period. These numbers will be denoted by $y_1^0$ and $y_2^0$. Furthermore, 
$y_1$ and $y_2$ (without the superscript) represent any amount of stock ordered up to at the 
beginning of the respective period. 

Denote by $C_1(x_1)$ the expected cost of following an optimum policy (minimum 
cost) from the beginning of period 1 to the end of period 2, given that there are $x_1$ 
units on hand. Similarly, denote by $C_2(x_2)$ the expected cost of following an optimum 
policy (minimum cost) from the beginning of period 2, given that there are $x_2$ units 
on hand. $C_1(x_1)$ is the expression sought because this expression is obtained by 
following the optimum policy for the entire (two-period) horizon. To obtain $C_1(x_1)$ it is 
necessary to first find $C_2(x_2)$. From the results for the single-period model, the optimal 
policy for a one-period problem is given by a single critical number found from 

\[ \Phi(y_2^0) = \frac{p - c}{p + h}; \]

that is, if $x_2$ is the amount available at the beginning of the last period, then 

    order $(y_2^0 - x_2)$, if $x_2 < y_2^0$ 
    do not order, if $x_2 \ge y_2^0$. 


The cost of this optimum policy can be expressed as 

\[ C_2(x_2) = \begin{cases} L(x_2), & \text{if } x_2 \ge y_2^0 \\ c(y_2^0 - x_2) + L(y_2^0), & \text{if } x_2 < y_2^0, \end{cases} \]

where $y_2^0$ is the single critical number just determined, and L(z) is the expected shortage 
plus holding cost for a single period when there are z units available. L(z) can be 
expressed in the same form as the single-period L(y) given earlier, with z in place of y. 

At the beginning of period 1, the costs incurred consist of the purchase cost 
$c(y_1 - x_1)$, the expected shortage plus holding cost $L(y_1)$, and the costs associated 
with following an optimal policy during the second period. Thus the expected cost of 
following the optimal policy for two periods is given by 

\[ C_1(x_1) = \min_{y_1 \ge x_1} \left\{ c(y_1 - x_1) + L(y_1) + E[C_2(x_2)] \right\}, \]

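where $E[C_2(x_2)]$ is obtained as follows: Note that $x_2$, the amount of stock on hand at 
the beginning of period 2, is a random variable that depends upon $y_1$ and the 
first-period demand; that is, $x_2 = y_1 - D_1$. Thus, 

\[ C_2(x_2) = C_2(y_1 - D_1) = \begin{cases} L(y_1 - D_1), & \text{if } y_1 - D_1 \ge y_2^0 \\ c(y_2^0 - y_1 + D_1) + L(y_2^0), & \text{if } y_1 - D_1 < y_2^0. \end{cases} \]

Hence $C_2(x_2)$ is a random variable, and its expected value is given by 

\[
\begin{aligned}
E[C_2(x_2)] = \int_0^\infty C_2(y_1 - \xi)\varphi_D(\xi)\, d\xi &= \int_0^{y_1 - y_2^0} L(y_1 - \xi)\varphi_D(\xi)\, d\xi \\
&\quad + \int_{y_1 - y_2^0}^\infty \left[ c(y_2^0 - y_1 + \xi) + L(y_2^0) \right] \varphi_D(\xi)\, d\xi.
\end{aligned}
\]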


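Note that because shortages are permitted, $(y_1 - \xi)$ can be negative; further note that 
$E[C_2(x_2)]$ is just a function of $y_1$ and $y_2^0$, with $y_2^0$ obtained from the solution to the 
single-period problem. Thus $C_1(x_1)$ can now be expressed as 

\[
\begin{aligned}
C_1(x_1) = \min_{y_1 \ge x_1} \Big\{ & c(y_1 - x_1) + L(y_1) + \int_0^{y_1 - y_2^0} L(y_1 - \xi)\varphi_D(\xi)\, d\xi \\
& + \int_{y_1 - y_2^0}^\infty \left[ c(y_2^0 - y_1 + \xi) + L(y_2^0) \right] \varphi_D(\xi)\, d\xi \Big\}.
\end{aligned}
\]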

It can easily be shown that $C_1(x_1)$ has a unique minimum so that the optimal 
value of $y_1$, denoted by $y_1^0$, satisfies the equation 

\[ -p + (p + h)\Phi(y_1^0) + (c - p)\Phi(y_1^0 - y_2^0) + (p + h) \int_0^{y_1^0 - y_2^0} \Phi(y_1^0 - \xi)\varphi_D(\xi)\, d\xi = 0, \]

where $\Phi(a)$ denotes the cumulative distribution function; that is, 

\[ \Phi(a) = \int_0^a \varphi_D(\xi)\, d\xi. \]


Thus, if $x_1$ is the amount available at the beginning of period 1, then the optimal 
policy is to 

    order $(y_1^0 - x_1)$, if $x_1 < y_1^0$ 
    not order, if $x_1 \ge y_1^0$. 

If the density function of demand is uniform over the range 0 to t, that is, 

\[ \varphi_D(\xi) = \begin{cases} \dfrac{1}{t}, & \text{if } 0 \le \xi \le t \\ 0, & \text{otherwise,} \end{cases} \]

then $y_1^0$ can be obtained from the expression 

\[ y_1^0 = \sqrt{ (y_2^0)^2 + \left[ \frac{2t(c - p)}{p + h} \right] y_2^0 + \frac{t^2 \left[ 2p(p + h) + (h + c)^2 \right]}{(p + h)^2} } - \frac{t(h + c)}{p + h}. \]





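Finally, if the density function of demand is exponential, that is, 

\[ \varphi_D(\xi) = \frac{1}{\lambda}\, e^{-\xi/\lambda}, \quad \text{for } \xi \ge 0, \]

the optimal value of $y_1^0$ satisfies the relationship 

\[ (h + c)e^{-(y_1^0 - y_2^0)/\lambda} + (p + h)e^{-y_1^0/\lambda} + (p + h)\, \frac{y_1^0 - y_2^0}{\lambda}\, e^{-y_1^0/\lambda} = 2h + c. \]

An alternative way of finding $y_1^0$ is to let $z^0$ denote $(y_1^0 - y_2^0)/\lambda$. Then $z^0$ satisfies the 
relationship 

\[ e^{-z^0}(h + c) + \left[ (p + h)z^0 + p + h \right] e^{-z^0 - y_2^0/\lambda} = 2h + c, \]

and $y_1^0 = \lambda z^0 + y_2^0$. 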


EXAMPLE: Consider the following example. The cost of producing an item is $10 
per item (c = 10). If any excess inventory remains at the end of a period, it is charged 
at $10 per item (h = 10); if no excess appears, there is no holding charge. If a 
shortage occurs within a period, there is a penalty cost of $15 per item (p = 15). 
The density function of demand is given by 

\[ \varphi_D(\xi) = \begin{cases} \dfrac{1}{10}, & \text{if } 0 \le \xi \le 10 \\ 0, & \text{otherwise.} \end{cases} \]


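It is necessary to find the optimum two-period policy. For linear costs, the equation 
for the optimal single-period model becomes 

\[ \Phi(y_2^0) = \frac{p - c}{p + h} = \frac{15 - 10}{15 + 10} = \frac{1}{5}. \]

Hence, because $\Phi(y_2^0) = y_2^0/10$, 

\[ y_2^0 = 2. \]

Because the distribution of demand is uniform (with parameter t = 10), the 
optimal value of $y_1^0$ is found from 

\[
\begin{aligned}
y_1^0 &= \sqrt{ 2^2 + \left[ \frac{2(10)(10 - 15)}{15 + 10} \right] 2 + \frac{(10)^2 \left[ 2(15)(15 + 10) + (10 + 10)^2 \right]}{(15 + 10)^2} } - \frac{10(10 + 10)}{15 + 10} \\
&= \sqrt{4 - 8 + 184} - 8 = 13.42 - 8 = 5.42.
\end{aligned}
\]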


Substituting $y_1 = 5$ and $y_1 = 6$ into $C_1(x_1)$ leads to a smaller value with $y_1 = 5$. 
Thus the optimal policy can be described as follows: If the initial amount of stock on 
hand is less than 5, order up to 5 units (order $5 - x_1$ units). Otherwise, do not order. 
After a period has elapsed, and at the beginning of the last period, if the amount on 
hand is less than 2 units, order up to 2 units (order $2 - x_2$ units, where $x_2$ may be 
negative). 
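The critical numbers in this example can be reproduced and checked against the first-order condition for the two-period model. The sketch below is an illustration under stated assumptions: it uses the closed-form expression for uniform demand, which presumes that the second critical number does not exceed the first and that the first does not exceed t, and it evaluates the integral in the optimality equation with a simple midpoint rule. All function names are hypothetical.

```python
import math

def uniform_cdf(a, t):
    """Cumulative distribution function of demand uniform on [0, t]."""
    return min(max(a / t, 0.0), 1.0)

def two_period_critical_numbers(c, p, h, t):
    """Return (y1, y2): critical numbers for periods 1 and 2 under uniform demand."""
    y2 = t * (p - c) / (p + h)
    y1 = math.sqrt(
        y2 ** 2
        + 2 * t * (c - p) / (p + h) * y2
        + t ** 2 * (2 * p * (p + h) + (h + c) ** 2) / (p + h) ** 2
    ) - t * (h + c) / (p + h)
    return y1, y2

def first_order_condition(y1, y2, c, p, h, t, steps=10_000):
    """Left-hand side of the equation that the period-1 critical number must satisfy."""
    width = (y1 - y2) / steps
    integral = sum(
        uniform_cdf(y1 - (k + 0.5) * width, t) * (1.0 / t) * width
        for k in range(steps)
    )
    return (-p + (p + h) * uniform_cdf(y1, t)
            + (c - p) * uniform_cdf(y1 - y2, t)
            + (p + h) * integral)

c, p, h, t = 10, 15, 10, 10  # data of the example
y1, y2 = two_period_critical_numbers(c, p, h, t)
residual = first_order_condition(y1, y2, c, p, h, t)
print(round(y1, 2), round(y2, 2))
```

The residual of the optimality equation is zero up to round-off, which supports the unrounded value of about 5.42 before the comparison of the integer candidates 5 and 6.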

This two-period model has been solved by using the same penalty shortage cost 
for each of the two periods, even though unsatisfied demand is backlogged at the end 
of the first period and lost at the end of the second period. This is a deficiency of the 
model as presented, but it can be remedied by using different penalty shortage costs 
in each period. The computational procedure of the modified model is similar, and 
no additional difficulties are added. 


Multiperiod Models—An Overview 


The two-period model can be extended to several periods or to an infinite number of 
periods. This section presents a summary of multiperiod results that have practical 
importance. 


MULTIPERIOD MODEL WITH NO SETUP COST: This model is an extension of the 
two-period problem. Suppose there exists a horizon of n periods for which we need to 
determine the optimal inventory policy. As before, assume that production leads to 
immediate delivery; shortages are to be backlogged except at the final period when 
they are lost; no disposal of stock is permitted. Furthermore, the demands for the n 
periods are independent, identically distributed random variables having density 
$\varphi_D(\xi)$. The purchase cost is linear, that is, cz, where z is the amount ordered (no 
setup cost), and the expected (one-period) shortage plus holding penalty cost L(y) is 
strictly convex [which is the case if each cost is linear and $\varphi_D(\xi) > 0$]. A cost 
discounting factor is included; it is denoted by $\alpha$, $0 < \alpha < 1$. 

Again the time periods are ordered so that the beginning of time period 1 implies 
that there are n periods left in the horizon. Similarly, the beginning of time period n 
implies there is one period left in the horizon (the last period is beginning). The 
problem is to find critical numbers that describe the optimal ordering policy. As in 
the two-period model, the numbers $y_1^0, y_2^0, \ldots, y_n^0$ are difficult to obtain numerically, 
but it can be shown that the optimal policy has the form 

    At the beginning of period i, i = 1, 2, ..., n, 

    order up to $y_i^0$ (order $y_i^0 - x_i$), if $x_i < y_i^0$ 
    do not order, if $x_i \ge y_i^0$. 

Furthermore, 

\[ y_n^0 \le y_{n-1}^0 \le \cdots \le y_2^0 \le y_1^0. \]

For the infinite-period model (where the inventory decisions are made indefinitely), 
there exists a single critical number, $y^0$, so that the optimal policy now has the 
form 

    At the beginning of period i, i = 1, 2, ..., 

    order up to $y^0$ (order $y^0 - x_i$), if $x_i < y^0$ 
    do not order, if $x_i \ge y^0$. 

Furthermore, $y^0$ is easily obtained because $y^0$ is the value of y that satisfies the 
expression 

\[ \frac{dL(y)}{dy} + c(1 - \alpha) = 0. \]

For the case of linear shortage and holding costs, unit costs of p and h, respectively, 
$y^0$ simply satisfies 

\[ \Phi(y^0) = \frac{p - c(1 - \alpha)}{p + h}. \]

A VARIATION OF THE MULTIPERIOD INVENTORY MODEL WITH NO SETUP COST: A 
slight modification of the aforementioned model leads to some simple, but interesting, 
results.¹ Consider the n-period model, but now assume that stock left over at the end 
of the final period can be salvaged with a return of the initial purchase cost c. Similarly, 
if there is a shortage at this time, the items are supplied also at the purchase price c. 
These two changes may lead to more realism in the model, but, what is equally 
important, they lead to rather simple optimal policies. In particular, the same single 
critical number $y^0$ is used for all periods, and, furthermore, the optimal policy is the 
same as that just presented in the infinite-period model; i.e., 

    At the beginning of period i, i = 1, 2, ..., n, 

    order up to $y^0$ (order $y^0 - x_i$), if $x_i < y^0$ 
    do not order, if $x_i \ge y^0$, 

where $y^0$ satisfies the expression 

\[ \frac{dL(y)}{dy} + c(1 - \alpha) = 0. \]

Of course, this same result also holds for the infinite-period model. For the case of 
linear shortage and holding costs, unit costs of p and h, respectively, $y^0$ simply satisfies 

\[ \Phi(y^0) = \frac{p - c(1 - \alpha)}{p + h}. \]

1 This formulation is due to A. F. Veinott, Jr.: "The Optimal Inventory Policy for Batch Orderings," 
Operations Research, 13(3):424-432, 1965. 

EXAMPLE OF A MULTIPERIOD MODEL: Example 2, Sec. 18.1, concerned the 
western distributor of bicycles who is having trouble with shortages of the most 
popular 10-speed model. The unit shortage cost p was estimated to be $15. The holding 
cost h was determined to be $1 per bicycle remaining at the end of the month. The 
actual cost of a bicycle is $35. Orders may be placed on the first working day of each 
month. The distributor always places an order for some model bicycles each month, 
so he is willing to assume that the marginal setup cost is zero for this most popular 
model. The discount factor may be assumed to be $\alpha = 0.995$. From past history, the 
distribution of demand can be approximated by a uniform distribution given by 

\[ \varphi_D(\xi) = \begin{cases} \dfrac{1}{800}, & \text{if } 0 \le \xi \le 800 \\ 0, & \text{otherwise.} \end{cases} \]


The distributor expects to stock this model indefinitely, so that the infinite-period 
model is appropriate. 
Because the shortage and holding costs are linear, $y^0$ satisfies 

\[ \Phi(y^0) = \frac{p - c(1 - \alpha)}{p + h}. \]

Because the demand distribution is uniform, 

\[ \Phi(y^0) = \frac{y^0}{800}, \]

and 

\[ \frac{p - c(1 - \alpha)}{p + h} = \frac{15 - 35(1 - 0.995)}{15 + 1} = 0.927, \]

so $y^0 = 741$. Thus, if the number of bicycles on hand, x, at the first of each month 
is fewer than 741, the optimal policy calls for ordering up to 741 (ordering 741 - x 
bicycles). Otherwise, no order is placed. Note that if this policy is to be in effect for 
only a finite number of months, say, 24, and if (1) bicycles remaining in stock after 
24 months can be salvaged at $35 per bicycle and (2) unsatisfied demand remaining 
at the end of 24 months can be supplied to the western distributor from, say, the 
eastern distributor, at $35 per bicycle, then the policy of ordering up to 741 bicycles 
each month is still optimal. 
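The arithmetic of this example is short enough to verify directly. The sketch below assumes linear costs and demand uniform on an interval starting at zero, as in the example; the function name and its `upper` parameter (the upper end of the demand range) are illustrative.

```python
def infinite_horizon_critical_number(c, p, h, alpha, upper):
    """Critical ratio and order-up-to level for demand uniform on [0, upper]."""
    ratio = (p - c * (1 - alpha)) / (p + h)
    return ratio, upper * ratio  # for uniform demand, the CDF at y is y / upper

ratio, y0 = infinite_horizon_critical_number(c=35, p=15, h=1, alpha=0.995, upper=800)
print(round(ratio, 3), round(y0))
```

Because the discount factor is close to 1, the purchase cost enters the ratio only through the small term c(1 - alpha), which is why the order-up-to level sits near the top of the demand range.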


MULTIPERIOD MODEL WITH SETUP COST: The introduction of a fixed setup cost 
K that is incurred when ordering often adds more realism to the model. Unfortunately, 
however, the mathematics becomes very cumbersome, and the results available are 
primarily those that characterize the form of the optimal policy. In particular, if the 
ordering cost is assumed to be K + cz for z > 0 and zero for z = 0, and if L(y) is 
strictly convex, then the optimal policy has the form 

    At the beginning of period i, i = 1, 2, ..., n, 

    order up to $S_i$ (order $S_i - x_i$), if $x_i < s_i$ 
    do not order, if $x_i \ge s_i$. 

This policy is the familiar (s, S) policy alluded to earlier in the discussion of 
one-period models. As previously mentioned, unfortunately exact computations of $s_i$ 
and $S_i$ for the finite- or infinite-horizon model are extremely difficult. However, the 
importance of this result cannot be minimized. Even if the exact $s_i$ and $S_i$ are unknown, 
it is important to know that one should consider using policies of this form rather than 
a policy from another class. 

(k, Q) POLICIES FOR A MULTIPERIOD MODEL WITHOUT SETUP COST: In the 
discussion of the previous models, any quantity could be ordered at the time of 
inventory review. Suppose that an additional constraint is placed on the ordering 
policy; i.e., each order for stock must be some nonnegative integral multiple of Q, a 
fixed positive constant. 

As before, the demands for the n periods are assumed to be independent, 
identically distributed random variables having density $\varphi_D(\xi)$. The purchase cost is linear, 
and the expected (one-period) shortage plus holding penalty cost L(y) is strictly 
convex. $\alpha$ is the cost discounting factor, $0 < \alpha < 1$. At the beginning of each period 
the system is reviewed. An order may be placed for any nonnegative integral multiple 
of Q, a fixed positive number. Thus orders must be placed in multiples of some 
standard batch size, e.g., a case or a truckload. When the demand exceeds the 
inventory on hand, the excess demand is backlogged until it is subsequently filled by a 
delivery. In addition, it is assumed that stock left over at the end of the final period 
n can be salvaged with a return of the initial purchase cost c. Similarly, if there is a 
shortage at this time, the items are backlogged also at the purchase price c. This model 
was introduced by Veinott.¹ He shows that a (k, Q) policy is optimal. A (k, Q) policy 
is described as follows: 


If at the beginning of a period the stock on hand is less than k, an order 
should be placed for the smallest multiple of Q that will bring the stock level 
to at least k (and probably higher); otherwise, an order should not be placed. 
The same parameter k is used in each period. 


The parameter k is chosen as follows: Let \(y^0\) be the minimizer of
\(G(y) = (1 - \alpha)cy + L(y)\). G(y) must be of the shape (convex) shown in Fig. 18.9. Then k
is any number for which \(k \le y^0 \le k + Q\) and \(G(k) = G(k + Q)\). Refer to Fig.
18.9. If a “ruler” of length Q is placed horizontally into the “valley,” k is found to
be that value of the abscissa to the left of \(y^0\) where the ruler intersects the valley.
Note that if the initial inventory on hand lies in \(R_1\), then Q is ordered; if it lies in \(R_2\),
then 2Q is ordered; and so on.
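
The "ruler in the valley" construction can be sketched numerically. The code below is an illustration under stated assumptions, not the procedure from Veinott's paper: the helper names are hypothetical, the search routines (ternary search and bisection) are generic choices that rely only on G being strictly convex, and the quadratic G is a made-up stand-in chosen so the answer can be checked by hand.

```python
import math


def minimize_convex(G, lo, hi, tol=1e-9):
    """Ternary search for the minimizer y0 of a strictly convex G on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if G(m1) <= G(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


def find_k(G, Q, y0, tol=1e-9):
    """Bisection for k in [y0 - Q, y0] such that G(k) = G(k + Q)."""
    lo, hi = y0 - Q, y0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # For convex G, G(k) - G(k + Q) decreases in k, so a positive
        # difference means the ruler's left end is still left of balance.
        if G(mid) > G(mid + Q):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def batch_order(x, k, Q):
    """Smallest multiple of Q that brings stock x to at least k (0 if x >= k)."""
    if x >= k:
        return 0.0
    return math.ceil((k - x) / Q) * Q


# Hypothetical G with minimizer 10; with Q = 4 the ruler balances at k = 8.
G = lambda y: (y - 10.0) ** 2
Q = 4.0
y0 = minimize_convex(G, 0.0, 30.0)
k = find_k(G, Q, y0)
print(round(y0, 4), round(k, 4))
print(batch_order(1.0, k, Q), batch_order(5.0, k, Q), batch_order(9.0, k, Q))
```

Bisection works here because convexity makes G(k) − G(k + Q) nonincreasing in k, so it changes sign exactly once on the interval from y0 − Q to y0.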

It should be noted that the same value of parameter k is used for each period of 
a finite-horizon model as well as for the infinite-horizon model. In the latter situation, 
this policy is the same optimal policy that would have been obtained if salvage costs 
and backlogging at the last period were omitted. 


CONTINUOUS REVIEW MODEL WITH FIXED DELIVERY LAG—BACKLOGGING: In
Sec. 18.3, a deterministic continuous review model, i.e., the economic lot-size model, 
was considered. The demand was assumed to be continuous, and required at a known 
constant rate. This model was classified as continuous review in that the inventory 
was continuously monitored, and orders were placed at any time; i.e., orders were 
placed whenever the inventory level reached the order point (or the trigger point). 
This ordering procedure is in contrast to the models considered in this section, which 
assume stochastic demand and periodic review; i.e., the inventory is monitored at 
certain times, and orders are placed only at these times. The model considered now
is analogous to the economic lot-size model (continuous review), except that the demand
for the item is stochastic and there is a fixed delivery lead time before an order


1 This formulation is due to A. F. Veinott, Jr.: “The Optimal Inventory Policy for Batch Orderings,”
Operations Research, 13(3):424-432, 1965.
Figure 18.9 Plot of the G(y) function.


is received. Only (s, S)-type policies¹ are considered; i.e., when the inventory falls
to a level s, an order is placed to bring the inventory up to S (a quantity Q =
S − s is ordered). This model is often called a lot-size reorder-point model; a quantity
Q is ordered whenever the inventory level reaches the reorder level s. Unsatisfied 
demand will be assumed to be filled immediately upon replenishment of the inventory; 
i.e., unsatisfied demand will be backlogged. In such models it is useful to consider 
the inventory level as the inventory position. The inventory position is defined as the 
amount on hand plus the amount ordered less the back-orders. Using the inventory 
position as the measure of inventory avoids the problem that may occur when the 
received order quantity fails to bring the on-hand inventory above the reorder point. 

The model can be described in detail as follows. Inventory is stockpiled and
used as demand dictates. When the inventory position reaches s, an order is placed
for Q units to bring the inventory position up to level S. There is a fixed delivery lead
time of length \(\lambda\) before the order is received, and the demand for items from inventory
during the time \(\lambda\) is assumed to be a continuous random variable, D, having probability
density function denoted by \(\varphi_D(\xi)\). The mean demand is assumed equal to \(a\lambda\); that
is,


\[E(D) = a\lambda,\]


where a is the expected number of items demanded per unit of time.

Figure 18.10 illustrates how the inventory level varies over time. Note that this 
diagram can be viewed as a series of cycles, with a cycle defined as the time between 
receipt (or placement) of consecutive orders. Also, note that if the demand during the
lead time \(\lambda\) is large, it is possible to have a negative inventory level. It is assumed that
this unsatisfied demand will be backlogged. In Fig. 18.10, the inventory position is
represented by the dashed lines during the period of the delivery lead time, and by 
the solid lines at other times. 

The costs to be considered are the ordering cost K, the cost of the units ordered, 
cQ (both charged at the time of the ordering), an inventory holding cost of h dollars 
per item per unit of time, and a shortage cost of p dollars for each unit of demand 
unfilled and independent of the duration of the shortage. The inventory policy is to 


1 Under Remark 3 after the economic lot-size model, it was pointed out that these models can be viewed
as a special case of an (s, S) policy.
Figure 18.10 Diagram of inventory level as a function of time.


track the inventory position so that when the inventory position reaches s, an order 
of size Q is placed; this order will be delivered after a period of length \(\lambda\). The problem,
then, is to determine when to place an order (find the order point or trigger point, s) 
and to determine what size it should be (find the order quantity, Q), so that the 
expected total cost per unit time is a minimum. 

The expected total cost per unit time [C(Q, s)] consists of the sum of the 
following three components: the expected ordering cost per unit of time [E(OC)], the 
expected holding cost per unit of time [E(HC)], and the expected shortage cost per 
unit of time [E(SC)]; that is, 


C(Q, s) = E(OC) + E(HC) + E(SC). 


In order to evaluate the terms of this expression, we assume initially that \(\lambda\) is
sufficiently small that there is never more than a single order outstanding and that the 
reorder point, s (based on the inventory position), is always nonnegative. The first 
assumption guarantees that the inventory on hand when an order is received will 
always fall above the reorder point because, otherwise, more than one order would 
be outstanding. If p/h is sufficiently large, as is usually the case in practice, these 
assumptions are generally satisfied. 

The expected ordering cost per unit time, E(OC), is simply the ordering cost 
incurred per cycle times the expected number of cycles per unit of time. The ordering 
cost incurred per cycle is 


K + cQ. 


In order to find the expected number of cycles per unit of time, assume that the unit
of time is, say, a year. Suppose that Q is 500 units, the lag in delivery time is 1
month (1/12 year), and the expected demand in 1 month is 100 units. These suppositions
imply that a = 1,200 since a is defined as the expected demand per year (the unit of
time chosen). It is evident that the expected time to deplete 500 units of inventory is
5/12 year (500/1,200). Because a cycle is defined as the time between the receipt of
orders of 500 units, the expected cycle length is equal to 5/12 year, so that in 1 year
the expected number of cycles is 12/5 = 2.4. In general, the expected number of cycles
per unit time is given by a/Q. The expression for the expected ordering cost, E(OC),
is then given by


\[E(OC) = \left(\frac{a}{Q}\right)(K + cQ).\]


The expected holding cost per unit time is equal to the expected inventory 
holding cost per cycle times the expected number of cycles per unit time (already 
obtained as a/Q). The expected holding cost per cycle is the expected holding cost 
per unit of inventory for a cycle times the average amount of inventory held during 
the cycle. 

The expected holding cost per unit of inventory for a cycle is the unit holding
cost, h, times the expected length of a cycle Q/a, that is, hQ/a. The average amount
of inventory held during a cycle can be obtained by averaging the on-hand inventory
at the beginning and end of a cycle. From Fig. 18.10, the expected inventory level
at the beginning of the cycle is given by \(S - a\lambda\), and the expected inventory level
at the end of the cycle is \(s - a\lambda\) [assuming the expected number of stockouts (negative
inventory) can be neglected, which is reasonable if the back-order state during a cycle
is small compared to the length of the cycle]. Hence the average amount of inventory
during a cycle can be approximated by


\[\frac{(S - a\lambda) + (s - a\lambda)}{2} = \frac{Q + s - a\lambda + s - a\lambda}{2} = \frac{Q}{2} + s - a\lambda,\]


so that

\[E(HC) = \left(\frac{hQ}{a}\right)\left(\frac{a}{Q}\right)\left(\frac{Q}{2} + s - a\lambda\right) = h\left(\frac{Q}{2} + s - a\lambda\right).\]


The expected shortage cost per unit time, E(SC), is simply the expected shortage cost
incurred per cycle times the expected number of cycles per unit time (already obtained
as a/Q). The expected shortage cost per cycle is just p times the expected number of
shortages that occur during the lag period \(\lambda\) (a shortage can occur only when the
demand during the delivery lead time exceeds s); i.e.,


\[p \int_s^\infty (\xi - s)\,\varphi_D(\xi)\,d\xi,\]


so that

\[E(SC) = \left(\frac{a}{Q}\right)\left(p \int_s^\infty (\xi - s)\,\varphi_D(\xi)\,d\xi\right).\]


Adding the expressions for E(OC), E(HC), and E(SC) leads to

\[C(Q, s) = \frac{aK}{Q} + ac + h\left(\frac{Q}{2} + s - a\lambda\right) + \left(\frac{pa}{Q}\right)\int_s^\infty (\xi - s)\,\varphi_D(\xi)\,d\xi.\]
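
To make the three cost components concrete, the following sketch evaluates C(Q, s) for a given lead-time demand density. Several things here are assumptions for illustration: the function names, the use of a simple midpoint rule for the integral, and the choice c = 0 (the term ac is a constant that does not affect where the minimum lies). The numbers are borrowed from the uniform-demand speaker example given later in this section.

```python
def expected_shortage(s, density, upper, n=20_000):
    """Midpoint-rule estimate of the integral of (xi - s) * density(xi) over [s, upper]."""
    if s >= upper:
        return 0.0
    width = (upper - s) / n
    total = 0.0
    for j in range(n):
        xi = s + (j + 0.5) * width
        total += (xi - s) * density(xi)
    return total * width


def expected_total_cost(Q, s, a, K, c, h, p, lam, density, upper):
    """C(Q, s) = E(OC) + E(HC) + E(SC) for the backlogging model."""
    ordering = (a / Q) * (K + c * Q)             # aK/Q + ac
    holding = h * (Q / 2 + s - a * lam)          # h(Q/2 + s - a*lambda)
    shortage = (p * a / Q) * expected_shortage(s, density, upper)
    return ordering + holding + shortage


# Lead-time demand uniform on [0, t]; data from the speaker example.
t = 16_000.0
uniform = lambda xi: 1.0 / t if 0.0 <= xi <= t else 0.0
params = dict(a=8_000.0, K=12_000.0, c=0.0, h=0.30, p=5.0, lam=1.0,
              density=uniform, upper=t)

best = expected_total_cost(26_968.0, 12_764.0, **params)
neighbors = [(500.0, 0.0), (-500.0, 0.0), (0.0, 500.0), (0.0, -500.0)]
increases = [expected_total_cost(26_968.0 + dQ, 12_764.0 + ds, **params) - best
             for dQ, ds in neighbors]
print(increases)  # each entry is the cost increase from moving away from (Q*, s*)
```

Moving Q or s away from the values reported in the example raises the cost in every direction tried, which is what one expects near a minimum of C(Q, s).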


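Because there are two decision variables (Q and s), the optimum values (Q*
and s*) are found by minimizing C(Q, s) with respect to the variables Q and s, that
is, setting the partial derivatives \(\partial C(Q, s)/\partial Q\) and \(\partial C(Q, s)/\partial s\) equal to zero. Thus


\[\frac{\partial C(Q, s)}{\partial Q} = -\frac{aK}{Q^2} + \frac{h}{2} - \frac{pa}{Q^2}\int_s^\infty (\xi - s)\,\varphi_D(\xi)\,d\xi = 0,\]


\[\frac{\partial C(Q, s)}{\partial s} = h - \frac{pa}{Q}\int_s^\infty \varphi_D(\xi)\,d\xi = 0.\]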
Solving these equations simultaneously¹ leads to


\[Q^* = \sqrt{\frac{2a\left[K + p\displaystyle\int_{s^*}^\infty (\xi - s^*)\,\varphi_D(\xi)\,d\xi\right]}{h}}, \tag{1}\]


\[\int_{s^*}^\infty \varphi_D(\xi)\,d\xi = \frac{hQ^*}{pa}. \tag{2}\]


Unfortunately, solving these equations simultaneously and obtaining a general 
closed-form expression for Q* and s* is not possible, so an iterative procedure is
desirable. Such a procedure is as follows. 


1. As an initial step, assume that p equals zero and obtain a value of Q from
Eq. (1). Note that this equation is just the expression for Q in the deterministic
economic lot-size model when shortages are not permitted.

2. Solve for s in Eq. (2) using the value of Q found in step 1. 

3. Using the value of s found in step 2, solve for a new Q using Eq. (1). 

4. Repeat steps 2 and 3 until successive values of Q and s are sufficiently close. 


In practice, this procedure will generally converge in just a few iterations. 
Remarks: Several remarks can be made about this continuous review model. 


1. Note that in Eq. (2), the integral \(\int_{s^*}^\infty \varphi_D(\xi)\,d\xi\) is just the probability that the
random variable demand, D, during the lead time exceeds s*, that is, P{D > s*}.
Hence, (hQ*/pa) must fall between 0 and 1. If the algorithm ever leads to a value of
(hQ/pa) > 1, this is an indication that the shortage cost (relative to the holding cost)
is too small, with the result that the number of back-orders during a cycle will tend
to be large. This will contradict the approximation of neglecting the number of stockouts
made in the calculation of the expected holding cost, so that the derived formulas
become inappropriate.


2. If the lead time is close to the average cycle length, Q/a, more than one 
order may be outstanding. Using the inventory position as the measure of inventory 
still leads to an operational rule, i.e., order when the inventory position reaches s. 
However, the number of stockouts may become too large to be neglected in the 
expected holding cost calculations. 


3. The quantity \((s - a\lambda)\) is known as the safety stock, and represents “protection”
against a stockout during the lead time. The probability of a stockout is
simply \(P\{D > s\} = \int_s^\infty \varphi_D(\xi)\,d\xi\) and can be obtained from hQ*/pa when the optimal
reorder point and lot size are used.


4. Because there is no closed-form solution for Eqs. (1) and (2), it is worth
considering some special cases of distribution of demand. If the density function of
demand is uniform over the range from 0 to t, that is,

\[\varphi_D(\xi) = \begin{cases} \dfrac{1}{t}, & \text{if } 0 \le \xi \le t \\[4pt] 0, & \text{otherwise,} \end{cases}\]
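
For this uniform density the integrals in Eqs. (1) and (2) can be evaluated directly. The following worked step is a sketch under the assumption that \(0 \le s^* \le t\), so that the reorder point lies inside the range of demand; it is consistent with the arithmetic of the uniform-demand example in this section.

```latex
\[
\begin{aligned}
\int_{s^*}^{\infty} \varphi_D(\xi)\,d\xi &= \frac{t - s^*}{t} = \frac{hQ^*}{pa}
  \quad\Longrightarrow\quad s^* = t\left[1 - \frac{hQ^*}{pa}\right], \\[6pt]
\int_{s^*}^{\infty} (\xi - s^*)\,\varphi_D(\xi)\,d\xi
  &= \frac{1}{t}\int_{s^*}^{t} (\xi - s^*)\,d\xi = \frac{(t - s^*)^2}{2t}, \\[6pt]
Q^* &= \sqrt{\frac{2a\left[K + \dfrac{p\,(t - s^*)^2}{2t}\right]}{h}}
     = \sqrt{\frac{a\left\{2K + pt + \dfrac{p\,(s^*)^2}{t} - 2ps^*\right\}}{h}}.
\end{aligned}
\]
```

The second form of Q* simply expands the square, which is the form in which the numbers are substituted in the example.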


1 Note that the optimum values of Q and s are independent of c, the unit cost of the items ordered. A little 
reflection indicates the total number of units ordered is independent of the values of Q and s, and, therefore, 
can be neglected in determining the optimum values of these parameters. 
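Finally, if the density function of demand is exponential, i.e.,


\[\varphi_D(\xi) = \left(\frac{1}{a\lambda}\right) e^{-\xi/(a\lambda)} \quad \text{for } \xi \ge 0,\]


then Eq. (2) leads to

\[\int_{s^*}^\infty \left(\frac{1}{a\lambda}\right) e^{-\xi/(a\lambda)}\,d\xi = e^{-s^*/(a\lambda)} = \frac{hQ^*}{pa},\]

so that

\[s^* = -a\lambda \ln\left(\frac{hQ^*}{pa}\right).\]

Furthermore, from Eq. (1),

\[\int_{s^*}^\infty (\xi - s^*)\,\varphi_D(\xi)\,d\xi = a\lambda\, e^{-s^*/(a\lambda)},\]

and

\[Q^* = \sqrt{\frac{2aK + 2a^2\lambda p\, e^{-s^*/(a\lambda)}}{h}} = \sqrt{\frac{2a}{h}\left(K + a\lambda p\, e^{-s^*/(a\lambda)}\right)}.\]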








EXAMPLE: Consider the speaker example presented in Sec. 18.1. It was assumed 
that K = 12,000, h = 0.30 per speaker per month, and a = 8,000 per month. In 
this example, the demand was assumed to be fixed at the rate of 8,000 speakers per 
month. Now we assume that the demand is random and has a uniform density over 
the range from 0 to 16,000, with the unit of time considered to be 1 month and the 
delivery lead time also 1 month. Note that the expected demand is 8,000 speakers, 
so that a is again 8,000 and \(\lambda\) is 1. The penalty for unsatisfied demand will be $5 per
speaker. Using the results just obtained for this distribution, with the iterative procedure
step 1 leads to (setting p equal to 0)


\[Q = \sqrt{\frac{2(8{,}000)(12{,}000)}{0.30}} = 25{,}298.\]


Using this value of Q, solve for s; that is,


\[s = 16{,}000\left[1 - \frac{(0.30)(25{,}298)}{(5)(8{,}000)}\right] = 12{,}964.\]


Using this value of s, solve for Q; that is,


\[Q = \sqrt{\frac{(8{,}000)\left\{2(12{,}000) + (5)(16{,}000) + \dfrac{5(12{,}964)^2}{16{,}000} - 2(5)(12{,}964)\right\}}{0.30}} = 26{,}773.\]


Using this value of Q, solve for s; that is,


\[s = 16{,}000\left[1 - \frac{(0.30)(26{,}773)}{(5)(8{,}000)}\right] = 12{,}787.\]


Three more iterations lead to (Q = 26,945, s = 12,767), (Q = 26,965, s = 12,764),
and finally, to (Q* = 26,968, s* = 12,764). The probability of a stockout is given
by


\[P\{D > s^*\} = \frac{hQ^*}{pa} = 0.20.\]


Consider this same example, but now assume that the distribution of demand
is exponential with mean 8,000. Step 1 in the iterative procedure again leads to
Q = 25,298. Using this value of Q, solve for s; that is,


\[s = -8{,}000 \ln\left[\frac{(0.30)(25{,}298)}{(5)(8{,}000)}\right] = 13{,}297.\]


Using this value of s, solve for Q; that is,


\[Q = \sqrt{\frac{2(8{,}000)}{0.30}\left[12{,}000 + (8{,}000)(5)e^{-13{,}297/8{,}000}\right]} = 32{,}323.\]


Several further iterations lead to the final solution of (Q* = 34,532, s* = 10,808).
The probability of a stockout is given by


\[P\{D > s^*\} = \frac{hQ^*}{pa} = 0.26.\]
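
The four-step iterative procedure and the two examples can be reproduced with a short script. This is a sketch under stated assumptions rather than a definitive implementation: the function and argument names are hypothetical, the stopping tolerance is an arbitrary choice, and the closed-form expressions passed in for each distribution are the ones obtained by substituting the uniform and exponential densities into Eqs. (1) and (2).

```python
import math


def solve_Q_s(a, K, h, p, s_from_tail, expected_shortage, tol=1e-6, max_iter=200):
    """Iterate Eqs. (1) and (2) of the backlogging model until Q and s settle.

    s_from_tail(r) returns the s with P{D > s} = r;
    expected_shortage(s) returns the integral of (xi - s) * density over [s, inf).
    """
    Q = math.sqrt(2 * a * K / h)                 # step 1: Eq. (1) with p = 0
    s = None
    for _ in range(max_iter):
        tail = h * Q / (p * a)                   # right-hand side of Eq. (2)
        if tail > 1:
            raise ValueError("hQ/(pa) > 1: shortage cost too small (see Remark 1)")
        s_new = s_from_tail(tail)                # step 2
        Q_new = math.sqrt(2 * a * (K + p * expected_shortage(s_new)) / h)  # step 3
        converged = (s is not None and abs(Q_new - Q) < tol
                     and abs(s_new - s) < tol)
        Q, s = Q_new, s_new
        if converged:                            # step 4: stop when values agree
            break
    return Q, s


a, K, h, p = 8_000.0, 12_000.0, 0.30, 5.0

# Uniform lead-time demand on [0, t].
t = 16_000.0
Q_u, s_u = solve_Q_s(a, K, h, p,
                     s_from_tail=lambda r: t * (1 - r),
                     expected_shortage=lambda s: (t - s) ** 2 / (2 * t))

# Exponential lead-time demand with mean a * lambda = 8,000.
mean = 8_000.0
Q_e, s_e = solve_Q_s(a, K, h, p,
                     s_from_tail=lambda r: -mean * math.log(r),
                     expected_shortage=lambda s: mean * math.exp(-s / mean))

print(round(Q_u), round(s_u), round(Q_e), round(s_e))
```

Run to a tight tolerance, the script lands on the uniform-case values quoted above and within about one unit of the exponential-case values, the small difference being due to rounding in the hand iterations.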





CONTINUOUS REVIEW MODEL WITH FIXED DELIVERY LAG—NO BACKLOGGING:
This model is similar to the previous model except that unsatisfied demand will be
assumed to be lost, i.e., unsatisfied demand will not be backlogged, and now lost
revenue will be included in the unsatisfied demand penalty cost. The derivation of the
costs contains the same assumptions that were made in the backlogging case, so the
subsequent expressions will lead to approximate results. The expected ordering cost
per unit of time, E(OC), and the expected shortage cost per unit of time, E(SC), are
the same for both models. The only cost that differs is the expected holding cost per
unit of time, E(HC), and the key to its determination is to note that the average
amount of inventory held during the cycle differs by the expected number of shortages
that occur during the lag period \(\lambda\); i.e., the no-backlogging case has an expected
inventory level per cycle greater than the backlogging case by an amount equal to the
expected number of shortages, which is given by \(\int_s^\infty (\xi - s)\,\varphi_D(\xi)\,d\xi\). Therefore, the
expected holding cost per unit of time, E(HC), can be written as


\[E(HC) = h\left[\frac{Q}{2} + s - a\lambda + \int_s^\infty (\xi - s)\,\varphi_D(\xi)\,d\xi\right].\]


The expected total cost per unit of time, C(Q, s), can be expressed as


\[C(Q, s) = \frac{aK}{Q} + ac + h\left[\frac{Q}{2} + s - a\lambda + \int_s^\infty (\xi - s)\,\varphi_D(\xi)\,d\xi\right] + \left(\frac{pa}{Q}\right)\int_s^\infty (\xi - s)\,\varphi_D(\xi)\,d\xi.\]
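Because there are two decision variables (Q and s), the optimum values (Q* and s*)
are found by minimizing C(Q, s) with respect to the variables Q and s, that is, setting
the partial derivatives \(\partial C(Q, s)/\partial Q\) and \(\partial C(Q, s)/\partial s\) equal to zero. Thus,


\[\frac{\partial C(Q, s)}{\partial Q} = -\frac{aK}{Q^2} + \frac{h}{2} - \frac{pa}{Q^2}\int_s^\infty (\xi - s)\,\varphi_D(\xi)\,d\xi = 0,\]


\[\frac{\partial C(Q, s)}{\partial s} = h - h\int_s^\infty \varphi_D(\xi)\,d\xi - \frac{pa}{Q}\int_s^\infty \varphi_D(\xi)\,d\xi = 0.\]


Solving these equations simultaneously¹ leads to


\[Q^* = \sqrt{\frac{2a\left[K + p\displaystyle\int_{s^*}^\infty (\xi - s^*)\,\varphi_D(\xi)\,d\xi\right]}{h}}, \tag{3}\]


\[\int_{s^*}^\infty \varphi_D(\xi)\,d\xi = \frac{hQ^*}{hQ^* + pa}. \tag{4}\]
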


Unfortunately, solving these equations simultaneously and obtaining a general 
closed-form expression for Q* and s* is not possible, so an iterative procedure is 
desirable. Such a procedure is as follows. 


1. As an initial step, assume that p equals zero and obtain a value of Q from
Eq. (3). Note that this equation is just the expression for Q in the deterministic
economic lot-size model when shortages are not permitted.

2. Solve for s in Eq. (4) using the value of Q found in step 1. 

3. Using the value of s found in step 2, solve for a new Q using Eq. (3). 

4. Repeat steps 2 and 3 until successive values of Q and s are sufficiently close. 


In practice, this procedure will generally converge in just a few iterations. 
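
The same kind of iteration applies to Eqs. (3) and (4). The sketch below assumes lead-time demand that is uniform on [0, t], for which P{D > s} = (t − s)/t and the expected shortage is (t − s)²/(2t). The function name is hypothetical, and the speaker data are reused purely for illustration, since the text gives no numerical example for this model.

```python
import math


def solve_lost_sales_uniform(a, K, h, p, t, tol=1e-6, max_iter=200):
    """Iterate Eqs. (3) and (4) when lead-time demand is uniform on [0, t]."""
    Q = math.sqrt(2 * a * K / h)                   # step 1: Eq. (3) with p = 0
    s = t
    for _ in range(max_iter):
        tail = h * Q / (h * Q + p * a)             # Eq. (4): required P{D > s}
        s = t * (1 - tail)                         # step 2: invert (t - s)/t = tail
        shortage = (t - s) ** 2 / (2 * t)          # expected shortage per cycle
        Q_new = math.sqrt(2 * a * (K + p * shortage) / h)   # step 3: Eq. (3)
        done = abs(Q_new - Q) < tol                # step 4: successive Q values agree
        Q = Q_new
        if done:
            break
    return Q, s


Q, s = solve_lost_sales_uniform(a=8_000.0, K=12_000.0, h=0.30, p=5.0, t=16_000.0)
print(round(Q), round(s))
```

Because the right-hand side of Eq. (4) is always strictly between 0 and 1, this version needs no check corresponding to Remark 1 of the backlogging model.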


MULTIPRODUCT INVENTORY MODELS: The previous sections dealt with inventory 
systems for single-product models, but most real inventory systems involve many 
products with various types of interactions, such as joint storage and budget limitations 
and product substitutability. One important reason for studying single-product models 
first is that they provide insight into solving multiproduct problems, and furthermore, 
it is often possible to ‘‘factor’’ an N-product problem into N one-product problems 
without loss of optimality. (This factoring can be done if the demand and cost for 


1 Note that the optimum values of Q and s are independent of c, the unit cost of the items ordered. A little
reflection indicates the total number of units ordered is independent of the values of Q and s, and, therefore, 
can be neglected in determining the optimum values of these parameters. 
each product can be treated independently of the other products.) There has been some
work on multiproduct models in which such factorization is not possible.¹ In these
models, stocks of a single product at different locations or echelons of a supply system
can also be conveniently viewed as stocks of different products. A multiproduct model
proposed by A. F. Veinott, Jr.,² is a direct analog of the single-product multiperiod
inventory model with no setup cost as presented earlier. This multiproduct model
considers N products and m different classes of demands for these products. The
demand classes in a period may be classified by such characteristics as time of occurrence,
essentiality, products desired, and acceptable substitutes. Under the usual
restrictions on the form of the costs, the optimal policy is a single critical number
policy for each product; i.e., if \(x_{ki}\) is the amount on hand of product k at period i,
order up to \(\bar{y}_k\) if \(x_{ki} < \bar{y}_k\); otherwise, do not order (assuming initially that \(x_{k1} \le \bar{y}_k\)
for all k).

This model has many useful applications. For example, suppose there are two 
products, with product 1 serving as a substitute for product 2, and all unsatisfied 
demands are lost. If we choose for this model a stocking policy that supplies unsatisfied 
demand for product 2 with excess stock from product 1 (if available), we can obtain 
the optimal critical numbers. This example can be interpreted as a two-echelon in- 
ventory model with demands for a single product that cannot be satisfied at echelon 
2 being transmitted up to echelon 1. 

A second application concerns two products that serve as substitutes for each 
other. Again, all unsatisfied demands are assumed to be lost. The stocking policy 
supplies the unsatisfied demand for one product with the excess stock from the other 
(if available). This example can be interpreted as a two-location inventory model with 
an end-of-period redistribution of excess stock (if any) at one location to satisfy a 
shortage (if one exists) at the other location. 

A different variation of the two-product inventory models was presented by 
D. Iglehart.³ Inventories of product 2 are maintained to provide capability for production
of product 1. For example, if product 1 is a car, product 2 might be machinery
or labor. An optimal policy describes the amount of product 1 and product 2 that 
must be produced in period i to minimize the total cost, subject to certain constraints. 


18.5 Conclusions 


The inventory models presented here are rather simplified, but they serve the purpose
of introducing the general nature of inventory models. Furthermore, they are sufficiently
accurate representations of many actual inventory situations so that they frequently
are useful in practice. For example, the economic lot-size formulas have been
particularly widely used, although they are sometimes modified to include some type
of stochastic demand. The multiperiod models with stochastic demand have been
important in characterizing the types of policies to follow, for example, (s, S) policies,
even though the optimal values of s and S are difficult to obtain. Nevertheless, many


1 An excellent summary is given in a paper by A. F. Veinott, Jr.: “The Status of Mathematical Inventory
Theory,” Management Science, 12(11):745-777, 1966.

2 Veinott, A. F., Jr.: “Optimal Policy for a Multi-Product, Dynamic Non-Stationary Inventory Problem,”
Management Science, 12(3):206-222, 1965.

3 Iglehart, D.: “Capital Accumulation and Production for the Firm: Optimal Dynamic Policies,” Management
Science, 12(3):193-205, 1965.


inventory situations possess complications that must still be taken into account, e.g., 
interaction between products. Several complex models have been formulated in an 
attempt to fit such situations, but they still leave a wide gap between practice and 
theory. Continued growth is occurring in the computerization of inventory data 
processing, along with accompanying growth in scientific inventory management. 
PROBLEMS


1.* Suppose that the demand for a product is 30 units per month, and the items are 
withdrawn uniformly. The setup cost each time a production run is made is $15. The production 
cost is $1 per item, and the inventory holding cost is $0.30 per item per month. 

(a) Assuming shortages are not allowed, determine how often to make a production run 

and what size it should be. 

(b) If shortages cost $3 per item per month, determine how often to make a production 

run and what size it should be. 


2. The demand for a product is 600 units per week, and the items are withdrawn 
uniformly. The items are ordered, and the setup cost is $25. The unit cost of each item is $3, 
and the inventory holding cost is $0.05 per item per week. 

(a) Assuming shortages are not allowed, determine how often to order and what size 

the order should be. 

(b) If shortages cost $2 per item per week, determine how often to order and what size 

the order should be. 


3. Solve Prob. 2 with shortages permitted and assume a delivery lag of 1 week. 


4.* Solve the economic lot-size model problem presented in Sec. 18.3 when shortages 
are permitted but when the cost is $5 per speaker. 




5. A taxi company uses gasoline at the rate of 8,500 gallons/month. The gasoline costs
$1.05/gallon, with a setup cost of $1,000. The inventory holding cost is 1 cent/gallon/month.

(a) Assuming shortages are not allowed, determine how often and how much to order.

(b) If shortages cost 50 cents/gallon/month, determine how often and how much to
order.


6. Solve Prob. 5(a) by assuming that the cost of gasoline drops to $1.00/gallon if at 
least 50,000 gallons are purchased. 


7. Solve Prob. 5(a) if the cost of gasoline is $1.20/gallon for the first 20,000 gallons 
purchased, $1.10 for the next 20,000 gallons, and $1.00/gallon thereafter. 


8. Consider the economic lot-size problem with shortages permitted, as presented in 
Sec. 18.3. Suppose, however, that it has been determined that S/Q = 0.8. Derive the expression 
for the optimal value of Q. 


9. In the economic lot-size model, suppose the stock is replenished uniformly (rather 
than instantaneously) at the rate of b items per unit time until the lot size Q is fulfilled. 
Withdrawals from the inventory are made at the rate of a items per unit time, where a < b. 
Replenishments and withdrawals of the inventory are made simultaneously. For example, if Q 
is 60, b is 3 per day, and a is 2 per day, then 3 units of stock arrive each day until day 20. 
Units are withdrawn at the rate of 2 per day for 30 days, at which time 3 units of stock arrive 
each day for 20 days, etc. The diagram of inventory versus time is given below. 


[Diagram: inventory level versus time (days), with the point of maximum inventory marked.]


(a) Find the total cost per unit time in terms of the setup cost K, production number Q, 
unit cost c, holding cost h, withdrawal rate a, and replenishment rate b. (Hint: In 
determining the average inventory level, divide the cycle into the two intervals 
I and II shown in the diagram. Then find the average inventory level in each interval.) 

(b) Determine the economic lot size. 


10. Suppose the requirement for the next 5 months is given by \(r_1 = 2\), \(r_2 = 4\),
\(r_3 = 2\), \(r_4 = 2\), and \(r_5 = 3\). Items are ordered; the setup cost is $4, the purchase cost is $1,
and the holding cost is $0.30. Determine the optimal production schedule that satisfies the 
monthly requirements. Use dynamic programming. 


11. Solve Prob. 10 by assuming that the production costs are given by $3 (1 + log,X), 
where X is the amount produced in a month. 


12. Solve Prob. 10 by using the algorithm presented in Sec. 18.3. 
13. Solve Prob. 11 by using the algorithm presented in Sec. 18.3. 
14. Formulate Prob. 10 as an integer programming problem. 


15.* Solve the production planning model for the production of speakers when the 
requirements are increased by 1 unit in each period. 


16. Solve the production planning model for the production of speakers when the unit 
costs during the first and third periods are increased to $1.40. Use dynamic programming. 


17. Develop an algorithm to solve the production planning model that uses forward 
induction. (Hint: If an optimal policy is being followed and if the time of the last production 
is known for this optimal policy, the total cost of the previous periods must be a minimum for 
this reduced problem because the overall policy is optimal.) 


18.* Consider a situation where a particular product is produced and placed in in-process 
inventory until it is needed in a subsequent production process. The number of units required 
in each of the next 3 months, as well as the setup cost and the regular-time unit production 
cost that would be incurred in each month, are






Regular-Time 











Month Setup Cost ($) | Unit Cost ($) 
1 5 8 
2 10 10 


5 9 


There currently is 1 unit in inventory, and we want to have 2 units in inventory at the end of 
the 3 months. A maximum of 3 units can be produced on regular-time production in each 
month, although one additional unit can be produced on overtime at a cost that is $2 larger 
than the regular-time unit production cost. The cost of storage is $2 per unit for each extra 
month that it is stored. 

Use dynamic programming to determine how many units should be produced in each 
month to minimize total cost. 


19. Consider a situation where a particular product is produced and placed in in-process 
inventory until it is needed in a subsequent production process. The number of units required 
in each of the next two months, as well as the setup cost, holding cost (charged as a function 
of excess of supply over requirement and charged at the end of the period), and regular-time 
unit production cost, are as follows: 










Setup Cost ($) 


5 
5 





Holding Cost ($) 


0.30 
0.30 


Unit Cost ($) 













Determine the optimal production schedule that satisfies the monthly requirements. Use the 
algorithm presented in Sec. 18.3. 


20. A newspaper stand purchases newspapers for 18 cents and sells them for 25 cents. 
The shortage cost is 25 cents per newspaper (because the dealer buys papers at retail price to 
satisfy shortages). The holding cost is 0.1 cent. The demand distribution is a uniform distribution 
between 200 and 300. Find the optimal number of papers to buy. 


21. Suppose the demand D for a spare airplane part has an exponential distribution with
parameter 1/50; that is,


\[\varphi_D(\xi) = \begin{cases} \dfrac{1}{50}\,e^{-\xi/50}, & \text{if } \xi \ge 0 \\[4pt] 0, & \text{otherwise.} \end{cases}\]


This airplane will be obsolete in 1 year, hence all production is to take place at the present 
time. The production costs now are $1,000 per item—that is, c = 1,000—but they become 
$10,000 per item if they must be supplied at later dates—that is, p = 10,000. The holding 
costs, charged on the excess after the end of the period, are $300 per item. 
(a) Determine the required number of spare parts.

(b) Suppose that the manufacturer has 23 parts already in inventory (from a similar, but 
now obsolete airplane). Determine the optimal inventory policy. 

(c) Suppose that p cannot be determined now, but the manufacturer wishes to order a 
quantity so that the probability of a shortage equals 0.05. How many units should 
be ordered? 

(d) If the manufacturer were following an optimal policy, but ordered the quantity in 
part (b), what is the implied value of p? 


22.* A bread manufacturer distributes bread to grocery stores daily. The cost of the 
bread is 40 cents per loaf. The company sells the bread to the stores for 60 cents per loaf sold, 
provided that it is disposed of as fresh bread (sold on the day it is baked). Bread not sold is 
returned to the company. The company has a store outlet that sells bread that is a day or more 
old for 30 cents per loaf. This salvage cost represents the holding cost. The unsatisfied demand 
cost is estimated to be 40 cents per loaf. If the demand has a uniform distribution between 
1,000 and 2,000 loaves, find the optimal daily number of loaves that the manufacturer should 
produce. 


23. A student majoring in operations research enjoys optimizing his personal decisions. 
He is analyzing one such decision currently, namely, how much money to take out of his 
savings account (if any) to buy travelers’ checks before leaving on a summer vacation trip to 
Europe. 

He already has used the money he had in his checking account to buy travelers’ checks 
worth $1,200, but this may not be enough. In fact, he has estimated the probability distribution 
of what he will need as shown in the following table: 








Amount needed ($) 1,000 1,100 1,200 1,300 1,400 1,500 1,600 1,700 


Probability 0.05 0.10 0.15 0.25 0.20 0.10 0.10 








If he turns out to have less than he needs, then he will have to leave Europe 1 week early for 
every $100 short. Because he places a value of $150 on each week in Europe, each week lost 
would thereby represent a net imputed loss of $50 to him. However, every $100 travelers’ 
check costs an extra $1. Furthermore, each such check left over at the end of the trip (which 
would be redeposited in the savings account) represents a loss of $2 in interest that could have 
been earned in the savings account during the trip, so he does not want to purchase too many. 

Using these data, determine the optimal decision on how many additional $100 travelers’ 
checks (if any) the student should purchase from his savings account money. 


24.* Find the optimal ordering policy for a one-period model, where the demand has a
probability density

\[\varphi_D(\xi) = \begin{cases} \dfrac{1}{20}, & \text{if } 0 \le \xi \le 20 \\[4pt] 0, & \text{otherwise,} \end{cases}\]


and the costs are 


Holding = $1 per item, 

Shortage = $3 per item, 
Setup = $1.50. 

Production = $2 per item. 


25. The campus bookstore must decide how many textbooks to order for a course that 
will be offered only once. The number of students who will take the course is a random variable 
D, whose distribution can be approximated by a (continuous) uniform distribution on the interval 
[40, 60]. After the quarter starts, the value of D becomes known. If D exceeds the number of 
books available, the known shortfall is made up by placing a rush order at a cost of $14 plus 
$2 per book over the normal ordering cost. If D is less than the stock on hand, the extra books 
are returned for their original ordering cost less $1 each. What is the order quantity that 
minimizes the expected cost? 


26. Consider the following inventory model, which is a single-period model with known 
density of demand \(\varphi_D(\xi) = e^{-\xi}\) for \(\xi \ge 0\) and zero elsewhere. There are two costs connected 
with the model: The first is the purchase cost, given by \(c \cdot (y - x)\); the second is the unsatisfied 
demand cost, which is just a constant, p (independent of the amount of unsatisfied demand). 

(a) If x units are available and goods are ordered up to y, write the expression for the 
expected loss, and describe completely the optimal policy. 

(b) If a fixed cost K is also incurred whenever an order is placed, describe the optimal 
policy. 

27. Using the approximation for finding the optimal policy for a single-period model 
when the density of demand has an exponential distribution, find this policy when 

\[ \varphi_D(\xi) = \begin{cases} \frac{1}{25} e^{-\xi/25}, & \text{if } \xi \ge 0 \\ 0, & \text{otherwise,} \end{cases} \]

and the costs are 

Holding = 40 cents per item, 
Shortage = $1.50 per item, 
Purchase price = $1 per item, 
Setup = $10. 


28.* There are production processes for which the difference between the cost of pro- 
ducing the maximum number of units allowed by some capacity restriction and the cost of 
producing any number of units less than this maximum is negligible; i.e., ordering is by batches. 
Consider a one-stage model, where the only two costs are holding costs given by 


h(y — D) = ñ (y — D), 
and the penalty cost of unsatisfied demand given by 
\[ p(D - y) = 2.5(D - y). \]
The density function for demand is given by 
\[ \varphi_D(\xi) = \begin{cases} \frac{1}{25} e^{-\xi/25}, & \text{if } \xi \ge 0 \\ 0, & \text{otherwise.} \end{cases} \]


If you order, you must order in batches of 100 units, and this quantity is delivered instanta- 
neously. Thus, if x denotes the quantity on hand, and if you do not order, then y = x. If you 
order one batch, then y = x + 100. Let G(y) denote the total expected cost of this inventory 
problem when there are y units available for the period (after you have ordered). 

(a) Write the expression for G(y). 

(b) What is the optimal ordering policy? 


29. Consider the following inventory situation. Demands are independent with common 
density given by the following: 

\[ \varphi_D(\xi) = \begin{cases} \frac{1}{25} e^{-\xi/25}, & \text{if } \xi \ge 0 \\ 0, & \text{otherwise.} \end{cases} \]

Orders may be placed at the start of each period without setup cost at a price of c = 10. There 
are a holding cost of 6 per unit remaining in stock at the end of each period and a penalty cost 
of 15 per unit quantity backlogged. 

(a) Find the optimal one-period policy. 

(b) Find the optimal two-period policy. 


30. Consider the following inventory situation. Demands are independent with common 
density \(\varphi_D(\xi) = \frac{1}{50}\), \(0 \le \xi \le 50\). Orders may be placed at the start of each period without 
setup cost at a price of c = 10. There are a holding cost of 8 per unit remaining in stock at 
the end of each period and a penalty cost of 15 per unit quantity backlogged. 

(a) Find the optimal one-period policy. 

(b) Find the optimal two-period policy. 


31.* Find the optimal inventory policy for the following two-period model by using a 
discount factor of \(\alpha = 0.9\). Let the density of the demand D be given by 

\[ \varphi_D(\xi) = \begin{cases} \frac{1}{25} e^{-\xi/25}, & \text{if } \xi \ge 0 \\ 0, & \text{otherwise,} \end{cases} \]

and the costs are 

Holding = 25 cents per item, 
Shortage = $2 per item, 
Purchase price = $1 per item. 

Stock left over at the end of the final period is salvaged for $1 per item, and shortages remaining 
at this time are also available at $1 per item. 


32. Solve Prob. 31 for a two-period model assuming no salvage value, no backlogging 
at the end of the second period, and no discounting. 


33. Solve Prob. 31 for an infinite-period model by using a discount factor of \(\alpha = 0.90\). 


34.* Determine the optimum inventory policy when the goods are to be ordered at the 
end of every month from now on. The cost of ordering up to y when x is available is given by 
2(y − x). Similarly, the cost of not satisfying a consumer demand of D is given by 5(D − y). 
The density function for the random variable, demand, is given by \(\varphi_D(\xi) = e^{-\xi}\). The storage 
costs are given by (y − D) and represent the expense of storing unsold stock. The losses at 
each succeeding stage are equivalent to a loss of 95 percent of that at the previous stage. 


35. Solve the inventory problem given in Prob. 34 but assuming that the policy is to be 
used for only one year (a 12-stage model). Shortages are backlogged each month, except that 
any shortages remaining at the end of the year are made up by purchasing similar items at a 
unit cost of $2. Any remaining inventory at the end of the year can be sold at a unit price of 
$2. 

36. A supplier of high-fidelity receiver kits is interested in using an optimal inventory 
policy. The distribution of demand per month is uniform between 2,000 and 3,000 kits. The 
cost of each kit is $150. The holding cost is estimated to be $2 per kit per month, and the 
unsatisfied demand cost is $30 per kit per month. Using a discount factor of \(\alpha = 0.90\), find 
the optimal inventory policy for this ‘‘infinite’’-horizon problem. 


37. The weekly demand for a certain type of electronic calculator is estimated to be 
given by 

\[ \varphi_D(\xi) = \begin{cases} \frac{1}{1,000} e^{-\xi/1,000}, & \text{if } \xi \ge 0 \\ 0, & \text{otherwise.} \end{cases} \]

The unit cost of these calculators is $8. The holding cost is 7 cents per calculator per week. 
The unsatisfied demand cost is $2 per calculator per week. Using a discount factor of \(\alpha = 0.95\), 
find the optimal inventory policy for this infinite-horizon problem. 


38. Consider an infinite-period inventory model with no setup cost and nonlinear penalty 
and holding costs. The demands in each period are independent and are exponentially distributed 
with mean 1. The penalty cost is exponential; if demand exceeds supply by z units, then the 
penalty cost is exp(#z). There is a fixed holding cost of 3; that is, if supply exceeds demand 
by any amount, the cost is 3. The ordering cost is 1 per unit, and the discount factor is 0.95. 
Find the optimal ordering policy. 


39.* Consider an infinite-period inventory model in which the demands are independent, 
identically distributed random variables. Denote the expected demand in a period by \(\mu\). Assume 
the cost of ordering z units (\(z \ge 0\)) is \(c \cdot z\) (\(c > 0\)). Let \(\alpha\) (\(0 < \alpha < 1\)) be the discount factor. 
Assume that all unsatisfied demand is backlogged. Finally, suppose that when y is the inventory 
on hand after ordering but before the occurrence of a demand of size D in a period, a cost 
\((y - D)^2\) is incurred. When y > D, the cost is a charge for carrying the inventory; when 
y < D, the cost is a charge for backlogging demand. Describe the optimal ordering policy, and 
give simple formulas for its parameters in terms of c, \(\alpha\), and \(\mu\). 


40. Find the optimal (k, Q) policy for Prob. 28 for an infinite-period model by using a 
discount factor of \(\alpha = 0.90\). 


41. For the case of linear shortage and holding costs, with unit costs of p and h, respectively, 
show that the value of \(y^0\) that satisfies 

\[ \Phi(y^0) = \frac{p - c(1 - \alpha)}{p + h} \]

is equivalent to the value of y that satisfies 

\[ \frac{dL(y)}{dy} + c(1 - \alpha) = 0, \]

where L(y), the expected shortage plus holding cost, is given by 

\[ L(y) = \int_y^{\infty} p(\xi - y)\,\varphi_D(\xi)\,d\xi + \int_0^{y} h(y - \xi)\,\varphi_D(\xi)\,d\xi. \]


42.* Solve Prob. 1(b) assuming that the demand is random with uniform distribution 
over the interval (0, 30), unsatisfied demand is backlogged, and the delivery lead time is 
1/2 month (2 weeks). 


43. Solve Prob. 2(b) assuming that the demand is random with an exponential 
distribution with mean 600, unsatisfied demand is backlogged, and the delivery lead time is 1 week. 


44. Solve Prob. 5(b) assuming that the demand is random with uniform distribution 
over the interval (0, 4,250), unsatisfied demand is backlogged, and the delivery lead time is 
1/4 month (1 week). 


45. Solve Prob. 5(b) assuming that the demand is random with an exponential 
distribution with mean 2,125, unsatisfied demand is backlogged, and the delivery lead time is 1/4 month 
(1 week). 


46. Consider the continuous review model with fixed delivery lag and backlogging, given 
on p. 724 of the text. Supposing that the order quantity Q is fixed in advance, derive the 
expression for the optimal reorder point, \(s^*\), when the distribution of demand is exponential 
with mean \(a\lambda\). 


47. Suppose that the criterion of minimizing the expected total inventory cost per unit 
time subject to meeting a service level is chosen for the continuous review model with fixed 
delivery lag and backlogging. That is, the decision maker wishes to choose an ‘‘optimal’’ policy 
subject to being assured that the probability of a shortage occurring during a cycle will not 
exceed V, or 

\[ P\{D > s\} \le V. \]

(a) Assuming that the density function of demand is uniform over the range from 0 to 
t, and that the inventory policy is to place an order of size Q whenever the inventory 
position reaches s, find the relation between s and V. 

(b) Using the smallest value of s in part (a), find the expression for C(Q, s), the expected 
total cost per unit of time, in terms of the only remaining variable Q. Call this cost 
C(Q). 

(c) Show that the optimal order size Q is given by 

\[ Q = \sqrt{\frac{2a}{h}\left(K + \frac{ptV^2}{2}\right)}. \]


48. Solve Prob. 47 when the density function of demand is exponential, or 

\[ \varphi_D(\xi) = \frac{1}{a\lambda}\, e^{-\xi/a\lambda}, \quad \text{for } \xi \ge 0. \]

For part (c), show that 

\[ Q = \sqrt{\frac{2a}{h}\,(K + a\lambda pV)}. \]


49. Consider the continuous review model with fixed delivery lag and no backlogging. 
Suppose the density function of demand is uniform over the range from 0 to t. Show that Eq. 
(4) becomes 

\[ s^* = t\left(\frac{pa}{hQ^* + pa}\right) \]

and Eq. (3) becomes 

\[ Q^* = \sqrt{\frac{2aK + apt + ap\,s^{*2}/t - 2aps^*}{h}}. \]


50. Consider the continuous review model with fixed delivery lag and no backlogging. 
Suppose the density function of demand is exponential. Show that Eq. (4) becomes 

\[ s^* = -a\lambda \ln\left(\frac{hQ^*}{hQ^* + pa}\right) \]

and Eq. (3) becomes 

\[ Q^* = \sqrt{\frac{2a}{h}\left(K + a\lambda p\, e^{-s^*/a\lambda}\right)}. \]
19.1 Introduction 


Chapter 18, ‘‘Inventory Theory,’’ was concerned with finding optimal inventory pol- 
icies. These policies were derived from inventory models and are, in part, dependent 
upon some forecast of sales or use of the items of interest. Forecasting is an essential 
component of any successful inventory system. However, forecasting need not be 
associated solely with problems of inventory control. Other examples where fore- 
casting plays an important role in industry include marketing, financial planning, and 
production. Indeed, it is extremely hard to think of examples where managerial 
decisions are made in the absence of some form of forecasting. We should emphasize 
that a forecast is not the final product itself; it is to be used as a tool in making a 
managerial decision. 

Forecasts can be obtained by using qualitative or quantitative techniques. In the 
former case, a forecast is usually the result of an expression of one or more experts’ 
personal judgment or opinion, and it is often called a judgmental technique. For 
example, a major research university calls in its leading economists every September 
to obtain their judgment on what to expect as an inflation rate for the next academic 
year—a number crucial to the budgeting process. This number is generally arrived at 
by consensus after prolonged discussion by the economists. 

Two distinct quantitative techniques are used in forecasting, both of which are 
conventional statistical techniques, namely, time series analysis and regression analysis. 
A statistical time series is simply a series of numerical values that a random variable 
takes on over a period of time. For example, the daily market closing prices of a 
particular stock over the period of a year constitute a time series. Time series analysis 
exploits techniques that utilize these data for forecasting the values that the variable 
of interest will take on in a future period. For example, the West Coast distributor of 
10-speed bicycles (Chap. 18) wants to make quarterly sales forecasts for planning 
purposes. He has data on sales during previous quarters; i.e., he has the values that 
the random variable quarterly sales takes on, and he wants to forecast the sales that 
will occur during the next quarter (or subsequent quarters). Forecasting, in general, 
is concerned with an analysis of past time series data in order to estimate one or more 
future values of the time series. The forecast depends upon a model of behavior of 
the time series. 

In regression analysis,¹ the variable to be forecast (the dependent variable) is 
expressed as a mathematical function of other (independent) variables. For example, 
forecasting the total sales of a textbook in a given period may be functionally related 
to the mail order sales during the same period. Data on mail order sales and total 
sales over previous periods may be used to forecast total sales in a future period given 
the mail order sales for that period. 

The three types of forecasts alluded to may be used in conjunction with one 
another. Indeed, the judgmental technique is often used together with an appropriate 
time series analysis. 


19.2 Judgmental Techniques 


Judgmental techniques are, by their very nature, subjective, and they may involve 
such qualities as intuition, expert opinion, and experience. They generally lead to 
forecasts that are based upon qualitative criteria. One commonly used technique is to 
bring together a group of experts who interact with each other and produce a consensus 
forecast. The group of university economists who were asked to 
forecast the inflation rate is an example of the expert group technique. 

Perhaps the most important judgmental technique is called the Delphi method. 
Like the expert group technique, the Delphi method utilizes a group of experts (not 
in a meeting setting). In addition to this group, there are one or more decision makers 
who ultimately are responsible for making the forecast. Finally, there is a staff who 
perform the duties associated with the method. These duties include the preparation 
of questionnaires and the analysis of their results. 

The Delphi method first utilizes a questionnaire that is sent to the panel of 
experts and then analyzed. Based upon the results of the first questionnaire, a second 
questionnaire is developed and sent to the same panel of experts, together with the 
results of the first questionnaire. This second questionnaire is then completed by the 
panel of experts and returned for analysis. Based upon the results of the two 
questionnaires, and using their own expertise, the decision makers ultimately come forth 
with a forecast. The key to the Delphi method is the feedback of the information 
contained in the first questionnaire to the panel of experts. Thus each member of the 
panel has access to information that he or she may have lacked originally, so that 
all members of the panel have the same information when completing the second 
questionnaire. 

¹ Regression is simply an expression of the form of the function of the expected value of the dependent 
variable, given the values of the independent variables. 

Of course, the success of the Delphi method hinges on the quality of the design 
of the questionnaires. Occasionally, more than two iterations may be used if they are 
deemed desirable. This situation occurs when there appears to be sufficient divergence 
in the first two questionnaires to warrant a third round in the hope that the feedback 
from the results of the second round will lead to more convergence in the third. 


19.3 Time Series 


A time series can be viewed as the representation of the outcomes of a random variable 
of concern over a fixed period of time, usually taken at equally spaced intervals. The 
daily closing prices of AT&T stock taken over the last year comprise a time series. 
The quarterly unemployment rate from January 1980 to July 1984 comprises a time 
series. This time series is shown in Fig. 19.1. The quarterly sales of the West Coast 
distributor of 10-speed bicycles over the last three quarters also comprise a time series. 
The behavior of a time series can be displayed in graphical form, bar graphical form, 
or tabular form, where the first method is generally most descriptive of the pattern of 
behavior of the series. 

Because a time series is a description of the past, a logical procedure for fore- 
casting the future is to make use of this historical data. If history is to repeat itself— 
i.e., if the past data are indicative of what we can expect in the future—we can 
postulate an underlying mathematical model that is representative of the process. 
Indeed, if this model is known, except possibly for certain parameters, we can generate 
forecasts. Alternatively, if the model is not known, the past data may be suggestive 
of its form. 

In most realistic situations, the exact form of the model that 
generates the time series is unknown. Frequently, a model is chosen by observing the 
outcomes of the time series over a period of time. Several typical time series patterns 
are shown in Fig. 19.2. Figure 19.2a shows a time series that might be observed if 
the generating process were represented by a constant level superimposed with random 
fluctuations. Figure 19.2b shows a time series that might be observed if the generating 
process were represented by a linear trend superimposed with random fluctuations. 
Finally, Fig. 19.2c shows a time series that might be observed if the generating process 
were represented by a constant level superimposed with a seasonal effect together 
with random fluctuations. There are many other plausible representations, but these 
three are very useful in practice and will therefore be considered in this chapter. 

Once the form of the model is chosen, an appropriate mathematical model of 
the generating process of the time series is identified, except possibly for the unknown 
values of parameters. For example, suppose that the generating process of the time 
series is identified as a constant level model superimposed with random fluctuations, 
that is, Fig. 19.2a. Such a representation can be given by 


\[ X_t = A + e_t, \]

where \(X_t\) is the random variable that is observed at time t, A is the ‘‘constant value’’ 
of the model, and \(e_t\) is the random error occurring at time t (frequently assumed to 
have expected value equal to zero and constant variance). Let \(F_{t+1}\) denote the forecast 
of the value of the time series at time t + 1. It is reasonable to expect that \(F_{t+1}\) will 
be a function of some, or all, of the observed values of the time series prior to time 
t + 1. 

Figure 19.2 Typical time series patterns. 
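The three generating processes described for Fig. 19.2 can be imitated with a few lines of Python. This is only an illustrative sketch: the function name `simulate_series`, the Gaussian form of the random error, and the multiplicative treatment of the seasonal effect are assumptions made here, not something the text prescribes.

```python
import random

def simulate_series(n, level, slope=0.0, seasonal=None, sigma=0.0, seed=0):
    """Generate n observations for t = 1, ..., n.

    The mean at time t is level + slope * t, optionally multiplied by a
    seasonal index that repeats with period len(seasonal). Zero-mean
    Gaussian noise with standard deviation sigma plays the role of e_t
    (the Gaussian choice is an assumption of this sketch).
    """
    rng = random.Random(seed)
    series = []
    for t in range(1, n + 1):
        mean = level + slope * t
        if seasonal:
            mean *= seasonal[(t - 1) % len(seasonal)]
        series.append(mean + rng.gauss(0.0, sigma))
    return series

# Hypothetical parameter values, chosen only for illustration.
constant_level = simulate_series(12, level=100.0, sigma=5.0)
linear_trend = simulate_series(12, level=100.0, slope=4.0, sigma=5.0)
with_seasons = simulate_series(12, level=100.0,
                               seasonal=[0.8, 1.0, 1.3, 0.9], sigma=5.0)
```

Setting `sigma=0.0` removes the random fluctuations, which makes it easy to see the underlying pattern that a forecasting procedure is trying to recover.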


19.4 Forecasting Techniques for Constant Level Models 


An appropriate model is that given at the end of Sec. 19.3, i.e., 

\[ X_t = A + e_t, \]

where \(X_t\) is the random variable that is observed at time t, A is the ‘‘constant value’’ 
of the model, and \(e_t\) is the random error occurring at time t (frequently assumed to 
have expected value equal to zero and constant variance). The following are four 
techniques that are often used in practice. 


Last Value Forecasting Procedure 


Denote by \(x_t\) the value the random variable \(X_t\) takes on. The last value forecasting 
procedure is to assume that the forecast at time t + 1, that is, \(F_{t+1}\), equals the value 
of the time series observed at time t, that is, \(x_t\). Therefore, 

\[ F_{t+1} = x_t. \]


For example, the West Coast distributor of 10-speed bicycles may make 
quarterly sales forecasts for planning purposes. He has data on sales during previous 
quarters; i.e., he has the values the random variable quarterly sales takes on, and he 
would like to forecast the sales that will occur in the forthcoming quarter or in 
subsequent quarters. 

Using the last value forecasting procedure, the bicycle distributor uses the sales 
from last quarter as the forecast of the sales for future quarters. This forecasting 
procedure has the disadvantage of being imprecise; i.e., its variance is large because 
it is based upon a sample of size 1. It is worth considering only if the underlying 
assumption about the ‘‘constant level’’ model is ‘‘shaky,’’ and the process is changing 
so rapidly that anything before time t is almost irrelevant or misleading; and/or the 
assumption of constant variance is unreasonable, and the conditional variance at time 
t is very small. This simple technique using past data is frequently compared with the 
results of using a more sophisticated technique to determine whether or not the 
sophisticated technique, which is generally more cumbersome, is of any value. 


Average Forecasting Procedure 


The bicycle distributor may use all his past quarterly data to forecast the sales for 
future quarters; i.e., he may choose 

\[ F_{t+1} = \sum_{i=1}^{t} \frac{x_i}{t}. \]

This estimate is an excellent one if the process is entirely stable, i.e., if the assumptions 
about the underlying model are correct. However, besides the procedure being cumbersome 
when large masses of data are involved, one does not want to use data that are too old 
because there always exists skepticism about the persistence of the underlying model 
for too long a period. 


Moving Average Forecasting Procedure 


In using a moving average estimate, the bicycle distributor uses only the last n periods; 
that is, 

\[ F_{t+1} = \sum_{i=t-n+1}^{t} \frac{x_i}{n}. \]

Not only does this forecasting technique use all the relevant history in the last n 
periods, but it is easily updated from period to period; i.e., the oldest observation is 
lopped off and the newest one is added. The moving average estimator combines the 
advantages of the previous estimators in that it uses only recent history and represents 
multiple observations. A disadvantage of this procedure is that it places as much 
weight on \(x_{t-n+1}\) as on \(x_t\), and intuitively one would expect a good procedure to place 
more weight on the most recent observation. 
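The first three procedures differ only in how much of the history they average, which a short Python sketch makes concrete. The function names and the sample sales figures below are hypothetical and serve only as an illustration.

```python
def last_value_forecast(xs):
    """F_{t+1} = x_t: use only the most recent observation."""
    return xs[-1]

def average_forecast(xs):
    """F_{t+1} = (x_1 + ... + x_t) / t: use the whole history."""
    return sum(xs) / len(xs)

def moving_average_forecast(xs, n):
    """F_{t+1} = average of the last n observations x_{t-n+1}, ..., x_t."""
    if not 1 <= n <= len(xs):
        raise ValueError("n must be between 1 and the number of observations")
    return sum(xs[-n:]) / n

# Hypothetical quarterly sales, oldest first.
sales = [2800, 2925, 3040, 3100]
print(last_value_forecast(sales))          # most recent quarter only
print(average_forecast(sales))             # all four quarters
print(moving_average_forecast(sales, 2))   # last two quarters
```

With n = 1 the moving average reduces to the last value procedure, and with n = t it reduces to the average procedure, so the choice of n controls the trade-off between precision and responsiveness discussed above.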


Exponential Smoothing Forecasting Procedure 
If the bicycle distributor uses exponential smoothing, then 

\[ F_{t+1} = \alpha x_t + (1 - \alpha)F_t, \]

where \(0 < \alpha < 1\) is called the smoothing constant. Thus the forecast is just a weighted 
sum of the last observation and the previous forecast. The choice of \(\alpha\) is discussed 
later. Note that the exponential smoothing technique represents a recursive relationship 
and can be expressed alternatively as 

\[ F_{t+1} = \alpha x_t + \alpha(1 - \alpha)x_{t-1} + \alpha(1 - \alpha)^2 x_{t-2} + \cdots. \]

In this form, it becomes evident that exponential smoothing gives the most weight to 
\(x_t\) and decreasing weights to earlier observations. Furthermore, the first form reveals 
that the forecast is simple to calculate because the data prior to period t need not be 
retained; all that is required is \(x_t\) and the previous forecast \(F_t\). Another alternative form 
for the exponential smoothing technique is given by 

\[ F_{t+1} = F_t + \alpha(x_t - F_t), \]

which gives a heuristic justification for this procedure. Thus, the forecast of the time 
series at time t + 1 is just the previous forecast at time t plus the product of the 
forecasting error at time t and the smoothing constant \(\alpha\). This alternative form is often simpler 
to use. Finally, a measure of effectiveness of exponential smoothing can be obtained 
under the assumption that the process is completely stable; that is, \(X_1, X_2, \ldots\) are 
independent, identically distributed random variables with variance \(\sigma^2\). It then follows 
that (for large t) 

\[ \operatorname{var}[F_{t+1}] \approx \frac{\alpha\sigma^2}{2 - \alpha} = \frac{\sigma^2}{(2 - \alpha)/\alpha}, \]

so that the variance is statistically equivalent to a moving average with \((2 - \alpha)/\alpha\) 
observations. If \(\alpha\) is chosen equal to 0.1, then \((2 - \alpha)/\alpha = 19\). Thus the exponential 
smoothing technique is ‘‘equivalent’’ to a moving average procedure that uses 19 
observations. However, it must be noted that when the aforementioned underlying 
assumptions are violated, exponential smoothing will react more quickly with superior 
‘‘tracking.’’ 
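The equivalence of the recursive and expanded forms, and the moving average ‘‘equivalent’’ of a given smoothing constant, can be checked numerically. The following Python sketch uses hypothetical function names and data; with a finite history, the expanded form also carries a remainder term, the starting forecast weighted by (1 − α) raised to the number of observations, which the infinite sum in the text leaves implicit.

```python
def exponential_smoothing(xs, alpha, initial_forecast):
    """Apply F_{t+1} = F_t + alpha * (x_t - F_t) over xs; return the final forecast."""
    forecast = initial_forecast
    for x in xs:
        forecast += alpha * (x - forecast)
    return forecast

def expanded_form(xs, alpha, initial_forecast):
    """Same forecast written as a weighted sum of all observations.

    The weight on the observation k periods back is alpha * (1 - alpha)**k;
    the starting forecast keeps the residual weight (1 - alpha)**t.
    """
    total = (1 - alpha) ** len(xs) * initial_forecast
    for k, x in enumerate(reversed(xs)):
        total += alpha * (1 - alpha) ** k * x
    return total

def equivalent_window(alpha):
    """Number of moving average observations with the same variance, (2 - alpha) / alpha."""
    return (2 - alpha) / alpha

# Hypothetical data and starting forecast.
history = [2800, 2925, 3040, 3100]
recursive = exponential_smoothing(history, alpha=0.1, initial_forecast=2750)
expanded = expanded_form(history, alpha=0.1, initial_forecast=2750)
```

The two functions agree up to floating-point rounding, and `equivalent_window(0.1)` evaluates to 19 up to rounding, matching the figure quoted in the text.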

An important drawback of exponential smoothing is that it lags behind a 
continuing trend; i.e., if the ‘‘constant’’ level model is incorrect and the mean is increasing 
steadily, then the forecast will be several periods behind. However, the procedure can 
be easily adjusted for trend (and even seasonally adjusted). Another disadvantage of 
exponential smoothing is that it is difficult to choose an appropriate smoothing constant 
\(\alpha\). Exponential smoothing can be viewed as a statistical filter that inputs raw data 
from a stochastic process and outputs smoothed estimates of a mean that varies with 
time. If \(\alpha\) is chosen to be small, response to change is slow, with resultant smooth 
estimators. Conversely, if \(\alpha\) is chosen to be large, response to change is fast, with 
resultant large variability in the output. Hence there is a need to compromise, 
depending upon the stability of the process. Furthermore, a ‘‘good’’ value of the 
smoothing constant depends upon the underlying stochastic process and the choice of a 
criterion to use in comparing constants. It has been suggested that \(\alpha\) should not exceed 
0.3 and that a reasonable choice for \(\alpha\) is approximately 0.1. Of course it can be 
increased, perhaps temporarily, if an unusual change is expected or when starting. 
When starting, a reasonable approach is to choose the forecast for period 2 according 
to 

\[ F_2 = \alpha x_1 + (1 - \alpha)(\text{initial estimate}), \]

where some initial estimate of the constant level A must be chosen. If past data are 
available, such an estimate may be the ‘‘average’’ of these data. When past history is 
available, another procedure for choosing \(\alpha\) is to run a retrospective simulation of the 
process; i.e., for a fixed value of \(\alpha\) and using past history, compare the forecasted 
quantity with the actual outcome and choose that value of \(\alpha\) which is in some sense 
optimal. It is hoped that the process will behave in the future in the same manner as 
it has in the past. 
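A retrospective simulation of this kind might be sketched as follows. The text leaves the sense of ‘‘optimal’’ open; the sum of squared one-step forecast errors used here is one common criterion and is an assumption of this sketch, as are the function names, the candidate values, and the sample history.

```python
def one_step_forecasts(xs, alpha, initial_estimate):
    """Return the forecast that was available before each observation arrived."""
    forecasts = []
    forecast = initial_estimate
    for x in xs:
        forecasts.append(forecast)                      # forecast for this period
        forecast = alpha * x + (1 - alpha) * forecast   # update after observing x
    return forecasts

def sum_squared_error(xs, alpha, initial_estimate):
    """Sum of squared one-step forecast errors over the history (assumed criterion)."""
    forecasts = one_step_forecasts(xs, alpha, initial_estimate)
    return sum((x - f) ** 2 for x, f in zip(xs, forecasts))

def best_alpha(xs, candidates, initial_estimate):
    """Pick the candidate smoothing constant with the smallest retrospective error."""
    return min(candidates, key=lambda a: sum_squared_error(xs, a, initial_estimate))

# Hypothetical history with an abrupt change in level after period 5.
shifted = [100, 100, 100, 100, 100, 200, 200, 200, 200, 200]
chosen = best_alpha(shifted, candidates=[0.1, 0.3, 0.9], initial_estimate=100)
```

On this deliberately unstable history the largest candidate wins because it tracks the jump fastest; on a stable, noisy history a smaller constant would usually do better, which is the compromise described above.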
19.5 A Forecasting Technique for Linear Trend Models 


As indicated earlier, the procedure outlined in the previous section lags behind a 
continuing trend. Suppose that the generating process of the observed time series can 
be represented by a linear trend superimposed with random fluctuations, and the linear 
trend has slope B. The slope is called the trend factor, and in the bicycle example, it 
represents the units per period that the expected sales rate increases or decreases. The 
model can be represented by 

\[ X_t = A + Bt + e_t, \]

where \(X_t\) is the random variable that is observed at time t, A is a constant, B is the 
trend factor, and \(e_t\) is the random error occurring at time t (frequently assumed to 
have expected value equal to zero and constant variance). In the previous model 
(constant level), the forecast for period t + 1, based upon data from the previous 
periods, is the same as forecasts for periods t + 1 + m, for m = 1, 2, . . . . For 
linear trend models, such a statement no longer holds. Hence, rather than referring to 
forecasts immediately, the concept of a ‘‘smoothed’’ level will be introduced. If \(x_t\) is 
the observed value of the time series at time t, then a ‘‘smoothed’’ level at time t, \(S_t\), 
will be a linear combination of \(x_t\) and the ‘‘smoothed’’ value at the preceding time 
period t − 1 corrected by adding the trend (slope) to indicate the passage of a unit 
of time; i.e., 

\[ S_t = \alpha x_t + (1 - \alpha)(S_{t-1} + B). \]

The forecast for time t + 1 can now be obtained; i.e., 

\[ F_{t+1} = S_t + B. \]

Unfortunately, the trend (slope) B is unknown so it must be estimated, and exponential 
smoothing can again be used for this purpose; i.e., 

\[ B_t = \beta(S_t - S_{t-1}) + (1 - \beta)B_{t-1}, \]

where \(B_t\) is the ‘‘smoothed’’ value of the trend at the end of period t, and \(0 < \beta < 1\) 
is another (possibly different from \(\alpha\)) smoothing constant.¹ Hence, the smoothed level 
at time t, \(S_t\), can now be expressed as 

\[ S_t = \alpha x_t + (1 - \alpha)(S_{t-1} + B_{t-1}), \]

and the forecast for m periods ahead, m = 1, 2, . . . , is given by 

\[ F_{t+m} = S_t + mB_t. \]

The forecasting procedures can now be summarized as follows. 


1. Using the observed value of the time series at the end of the tth period, \(x_t\), 
the smoothed level of the time series at time t − 1, \(S_{t-1}\), and the smoothed 
value of the trend at the end of period t − 1, \(B_{t-1}\), the smoothed level of 
the time series at time t is given by 

\[ S_t = \alpha x_t + (1 - \alpha)(S_{t-1} + B_{t-1}). \]

2. From the smoothed level of the time series at time t, \(S_t\) (calculated in step 
1), the smoothed level of the time series at time t − 1, \(S_{t-1}\), and the 
smoothed value of the trend at the end of period t − 1, \(B_{t-1}\), the smoothed 
value of the trend at the end of period t is given by 

\[ B_t = \beta(S_t - S_{t-1}) + (1 - \beta)B_{t-1}. \]

3. Forecast of the time series for m periods ahead, m = 1, 2, . . . , is given 
by 

\[ F_{t+m} = S_t + mB_t. \]

¹ The previous discussion concerning the choice of \(\alpha\) is relevant to the choice of \(\beta\). 

Thus, in the context of forecasting sales of bicycles, the actual sales of bicycles, 
\(x_t\), is obtained at the end of period t; the smoothed level of the sales at the end of 
period t − 1, \(S_{t-1}\), and the smoothed level of the trend at the end of period t − 1, 
\(B_{t-1}\), are known so that \(S_t\) can be obtained; the smoothed level of the trend at the end 
of period t − 1, \(B_{t-1}\), and the smoothed level of sales at times t and t − 1, \(S_t\) and 
\(S_{t-1}\), respectively, are known so that \(B_t\) can be obtained; and finally, the forecast of 
sales m periods ahead is then given by \(F_{t+m}\). 

As in the case of exponential smoothing for a constant level model, an initial 
value is required to start the smoothing process for the linear trend model. This 
initialization is frequently obtained by fitting a straight line to some past data (using 
methods shown in Sec. 19.8). The fitted line can be used to obtain an initial value of 
the smoothed level of the time series, \(S_0\), and an initial value of the smoothed trend 
level, \(B_0\). Thus, 

\[ S_1 = \alpha x_1 + (1 - \alpha)(S_0 + B_0), \]

and 

\[ B_1 = \beta(S_1 - S_0) + (1 - \beta)B_0. \]

If a forecast for period 2 is desired, it can be obtained from 

\[ F_2 = F_{1+1} = S_1 + B_1. \]

EXAMPLE: Suppose that quarterly sales for bicycles were 2,800, 2,925, and 3,040,
respectively. Use exponential smoothing based upon the first three observations to
forecast sales for the fifth period, using $\alpha = \beta = 0.1$. From past data (prior to the
three data points), a straight line was fit. The value on the line corresponding to the
last observed time is 2,750, and the slope is 100. Thus $S_0 = 2,750$ and $B_0 = 100$.
The smoothed value of sales at the end of period 1 is given by

$$S_1 = 0.1(2,800) + 0.9(2,750 + 100) = 2,845.$$

The smoothed value of the trend at the end of period 1 is given by

$$B_1 = 0.1(2,845 - 2,750) + 0.9(100) = 99.5.$$

Repeating this procedure for the second observation leads to

$$S_2 = 0.1(2,925) + 0.9(2,845 + 99.5) = 2,943$$

and

$$B_2 = 0.1(2,943 - 2,845) + 0.9(99.5) = 99.4.$$

Finally, the third observation results in

$$S_3 = 0.1(3,040) + 0.9(2,943 + 99.4) = 3,042$$

and

$$B_3 = 0.1(3,042 - 2,943) + 0.9(99.4) = 99.4.$$

Therefore, the forecast of sales for the fifth period is

$$F_{3+2} = F_5 = 3,042 + 2(99.4) = 3,241.$$

If a forecast for the second period were desired (made at the end of the first
period), it would result in

$$F_2 = S_1 + B_1 = 2,845 + 99.5 = 2,945.$$

Similarly, if a forecast for the third period were desired (made at the end of the second
period), it would result in

$$F_3 = S_2 + B_2 = 2,943 + 99.4 = 3,042.$$
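The recursion used in this example can be checked with a short program. The following is a minimal sketch, not a definitive implementation; the function names `smooth_trend` and `forecast_trend` are hypothetical and chosen here only for illustration. Because the program carries full precision rather than rounding at each step as the text does, its results may differ from the figures above by about one unit.

```python
def smooth_trend(observations, s0, b0, alpha, beta):
    """Return the (S_t, B_t) pair obtained after each observation.

    Hypothetical helper: applies S_t = alpha*x_t + (1 - alpha)*(S_{t-1} + B_{t-1})
    and B_t = beta*(S_t - S_{t-1}) + (1 - beta)*B_{t-1}.
    """
    s_prev, b_prev = s0, b0
    history = []
    for x in observations:
        s = alpha * x + (1 - alpha) * (s_prev + b_prev)  # smoothed level
        b = beta * (s - s_prev) + (1 - beta) * b_prev    # smoothed trend
        history.append((s, b))
        s_prev, b_prev = s, b
    return history


def forecast_trend(s_t, b_t, m):
    """Forecast m periods ahead from the latest level and trend."""
    return s_t + m * b_t


# Bicycle example: S_0 = 2,750, B_0 = 100, alpha = beta = 0.1.
history = smooth_trend([2800, 2925, 3040], s0=2750, b0=100, alpha=0.1, beta=0.1)
s3, b3 = history[-1]
f5 = forecast_trend(s3, b3, m=2)  # forecast for period 5, made at the end of period 3
print(round(s3), round(b3, 1), round(f5))
```

Calling `forecast_trend` with `m=1` on the first or second entry of `history` gives the one-period-ahead forecasts for periods 2 and 3.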


19.6 A Forecasting Technique for Constant Level 
with Seasonal Effects Models 


In many important forecasting problems, there exist seasonal effects that must be 
accounted for in the model. Suppose that the generating process of the observed time 
series can be represented by a constant level superimposed with seasonal effects and 
random fluctuations. Such a model can be represented by 


$$X_t = A I_t + e_t,$$

where $X_t$ is the random variable that is observed at time $t$, $A$ is a constant, $I_t$ is the
seasonal index or factor for period $t$, and $e_t$ is the random error occurring at time $t$
(frequently assumed to have expected value equal to zero and constant variance).
Unfortunately, both $A$ and $I_t$ are unknown, and ‘‘smoothed’’ levels at time $t$ for both
of these factors are useful prior to making a forecast. Exponential smoothing can again
be used for this purpose; that is,

$$S_t = \alpha\left(\frac{x_t}{I_{t-p}}\right) + (1 - \alpha)S_{t-1},$$

$$I_t = \gamma\left(\frac{x_t}{S_t}\right) + (1 - \gamma)I_{t-p},$$

and the forecast for the next period is given by

$$F_{t+1} = S_t I_{t-p+1},$$

where $p$ is the number of periods in the seasonal cycle (e.g., if the seasonal periods
are autumn, winter, spring, summer, then $p = 4$), and $0 < \gamma < 1$ is another smoothing
constant. Note that $I_{t-p}$ represents the ‘‘smoothed’’ value of the seasonal index for
period $t$ computed for the same season $p$ periods ago; e.g., the seasonal index for the
autumn of 1990 is based upon 1989 autumn data.

The forecast for $m$ periods ahead, $m = 1, 2, \ldots$, is given by

$$F_{t+m} = S_t I_{t-p+m}.$$

If $m$ is greater than $p$, $I_{t-p+m}$ has not, as yet, been calculated. For this case, $I_{t-p+m}$
is to be interpreted as the last computed value of the corresponding seasonal index;
e.g., if a forecast for the autumn of 1993 is desired, and no data later than the winter
of 1990 are available, the seasonal index factor for autumn of 1990 is used in place
of $I_{\text{aut 1992}}$.


The technique for forecasting can be summarized as follows. 


1. Using the observed value of the time series at the end of the $t$th period, $x_t$,
and the smoothed value of the seasonal index for period $t$ computed for the
same season $p$ periods ago, $I_{t-p}$, the smoothed level of the time series at
time $t$ is given by

$$S_t = \alpha\left(\frac{x_t}{I_{t-p}}\right) + (1 - \alpha)S_{t-1}.$$

2. From the smoothed level of the time series at time period $t$, $S_t$ (calculated
in step 1), the smoothed level of the time series at time period $t - 1$, $S_{t-1}$,
and the smoothed value of the seasonal index for period $t$ computed for the
same season $p$ periods ago, $I_{t-p}$, the updated smoothed level of the seasonal
index for period $t$ is given by

$$I_t = \gamma\left(\frac{x_t}{S_t}\right) + (1 - \gamma)I_{t-p}.$$

3. Forecast of the time series for $m$ periods ahead, $m = 1, 2, \ldots$, is given
by

$$F_{t+m} = S_t I_{t-p+m}.$$


As in the other exponential smoothing procedures, initial values are required to
start the smoothing process. For models containing seasonal factors, a full cycle of
seasonal data is required, as is suggested in the discussion about $I_{t-p}$. A description
of the initialization procedure is best given in the context of an example. Consider
the bicycle example of Sec. 19.5, and suppose that bicycle sales are seasonal (autumn,
winter, spring, summer). Upon examination of past data, the trend model has been
found to be inappropriate and has been replaced by a seasonal model as described
above. The data are given in columns 1 and 2 of Table 19.1.

A reasonable way to initially estimate the seasonal factors is to divide last
year’s quarterly sales by the quarterly average sales over the last year, that is,
(2,786 + 2,928 + 3,025 + 3,061)/4 = 2,950. These values are shown in column
3 of Table 19.1. The initial estimate of the constant level $A$ is chosen to be the average
of the four quarters over the past year, that is, $S_0 = 2,950$. Exponential smoothing
is to be used to forecast the bicycle sales for autumn, winter, and spring of this year
and, ultimately, for the summer quarter of this year. Use $\alpha = 0.1$ and $\gamma = 0.2$.

The forecast for autumn is given by

$$F_{\text{aut}} = S_0(I_{\text{old aut}}) = 2,950(0.944) = 2,785.$$


Table 19.1 Bicycle Sales

                 (1)                (2)                (3)
Quarter     Old (Last) Year    New (This) Year    Initial Seasonal Factor
Autumn          2,786              2,800              0.944
Winter          2,928              2,925              0.993
Spring          3,025              3,040              1.025
Summer          3,061                                 1.038
Total          11,800                                 4.000
In order to obtain the forecast for the winter quarter, the smoothed level of the time
series, $S_{\text{aut}}$, must be obtained. The smoothed level of the seasonal factor for next
year’s autumn quarter forecast, $I_{\text{new aut}}$, is also obtained.

$$S_{\text{aut}} = \alpha\left(\frac{x_{\text{aut}}}{I_{\text{old aut}}}\right) + (1 - \alpha)S_0 = 0.1\left(\frac{2,800}{0.944}\right) + 0.9(2,950) = 2,952,$$

and

$$I_{\text{new aut}} = \gamma\left(\frac{x_{\text{aut}}}{S_{\text{aut}}}\right) + (1 - \gamma)I_{\text{old aut}} = 0.2\left(\frac{2,800}{2,952}\right) + 0.8(0.944) = 0.945.$$

The forecast for the winter quarter is then given by

$$F_{\text{aut}+1} = F_{\text{win}} = S_{\text{aut}}(I_{\text{old win}}) = 2,952(0.993) = 2,931.$$

In order to obtain the forecast for the spring quarter, the smoothed level of the
time series, $S_{\text{win}}$, must be obtained. The smoothed level of the seasonal factor for next
year’s winter quarter forecast, $I_{\text{new win}}$, is also obtained.

$$S_{\text{win}} = \alpha\left(\frac{x_{\text{win}}}{I_{\text{old win}}}\right) + (1 - \alpha)S_{\text{aut}} = 0.1\left(\frac{2,925}{0.993}\right) + 0.9(2,952) = 2,951,$$

and

$$I_{\text{new win}} = \gamma\left(\frac{x_{\text{win}}}{S_{\text{win}}}\right) + (1 - \gamma)I_{\text{old win}} = 0.2\left(\frac{2,925}{2,951}\right) + 0.8(0.993) = 0.993.$$

The forecast for the spring quarter is then given by

$$F_{\text{win}+1} = F_{\text{spr}} = S_{\text{win}}(I_{\text{old spr}}) = 2,951(1.025) = 3,025.$$

Finally, in order to obtain the forecast for the summer quarter, the smoothed
level of the time series, $S_{\text{spr}}$, must be obtained. The smoothed level of the seasonal
factor for next year’s spring quarter forecast, $I_{\text{new spr}}$, is also obtained.

$$S_{\text{spr}} = \alpha\left(\frac{x_{\text{spr}}}{I_{\text{old spr}}}\right) + (1 - \alpha)S_{\text{win}} = 0.1\left(\frac{3,040}{1.025}\right) + 0.9(2,951) = 2,952,$$

and

$$I_{\text{new spr}} = \gamma\left(\frac{x_{\text{spr}}}{S_{\text{spr}}}\right) + (1 - \gamma)I_{\text{old spr}} = 0.2\left(\frac{3,040}{2,952}\right) + 0.8(1.025) = 1.026.$$

The forecast for the summer quarter is then given by

$$F_{\text{spr}+1} = F_{\text{sum}} = S_{\text{spr}}(I_{\text{old sum}}) = 2,952(1.038) = 3,064.$$
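The quarter-by-quarter calculations above follow one repeated pattern, which can be sketched in a few lines of code. This is an illustrative sketch only; the function name `seasonal_step` and the dictionary layout are assumptions made here, not part of the text. The seasonal factors are entered as rounded in Table 19.1, and the program does not round intermediate values, so a forecast may differ from the text by about one unit.

```python
def seasonal_step(x_t, s_prev, i_old, alpha, gamma):
    """One update of the constant-level seasonal model (hypothetical helper).

    x_t    : observation for the current period
    s_prev : previous smoothed level, S_{t-1}
    i_old  : seasonal index for the same season one cycle ago, I_{t-p}
    """
    s_t = alpha * (x_t / i_old) + (1 - alpha) * s_prev  # step 1: smoothed level
    i_t = gamma * (x_t / s_t) + (1 - gamma) * i_old     # step 2: seasonal index
    return s_t, i_t


seasons = ["autumn", "winter", "spring", "summer"]
index = {"autumn": 0.944, "winter": 0.993, "spring": 1.025, "summer": 1.038}
this_year = {"autumn": 2800, "winter": 2925, "spring": 3040}
level = 2950.0            # S_0, last year's quarterly average
alpha, gamma = 0.1, 0.2

forecasts = {}
for season in seasons:
    # Step 3 with m = 1: latest level times last year's index for this season.
    forecasts[season] = level * index[season]
    if season in this_year:
        level, index[season] = seasonal_step(
            this_year[season], level, index[season], alpha, gamma
        )

for season in seasons:
    print(season, round(forecasts[season]))
```

After the loop, `index` holds the updated factors for autumn, winter, and spring, which would serve as the "old" factors when next year's forecasts are made.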


Although these forecasts appear to yield good results, a forecasting technique 
based upon a model of a generating process of the time series that is represented by 
a linear trend superimposed with seasonal effects and random fluctuations (a model 
not presented in this text) may be more appropriate (based upon the limited amount 
of past data presented). 


19.7 Forecasting Errors 


Several forecasting techniques have been presented together with different underlying 
models of the time series. How does one compare these techniques, especially if the 
generating process is unknown, as is frequently the case in practice? Some measure 
of performance is called for. 

Define the forecast error, $E_t$, as the difference between the observed value of
the time series for period $t$ and the forecast for period $t$, that is,

$$E_t = x_t - F_t.$$

The forecast error is also referred to as the residual. If the underlying model
of the time series is appropriate for the forecasting technique chosen, the $E_t$ should
behave, in a probabilistic sense, like the random error associated with the stochastic
process, and the technique that produces ‘‘small’’ values of $E_t$ is desirable. A measure
of ‘‘small’’ is the mean square error (MSE) associated with the forecasting technique.
If there are $n$ time periods, then

$$\text{MSE} = \frac{E_1^2 + E_2^2 + \cdots + E_n^2}{n}.$$


For the bicycle example using winter and spring values together with the appropriate
forecasts, the following are mean square errors for the three forecasting techniques
discussed earlier:

$$\text{Last value forecasting MSE} = \frac{(2,925 - 2,800)^2 + (3,040 - 2,925)^2}{2} = 14,425,$$

$$\text{Linear trend forecasting MSE} = \frac{(2,925 - 2,945)^2 + (3,040 - 3,042)^2}{2} = 202,$$

$$\text{Seasonal forecasting MSE} = \frac{(2,925 - 2,931)^2 + (3,040 - 3,025)^2}{2} = 130.5.$$


The best technique of the three, based upon the MSE measure of effectiveness, 
is the seasonal forecasting technique. Last value forecasting performs poorest. Of 
course, these results are based upon past data (and very few points at that), and there 
is no assurance that, say, the seasonal forecasting technique will perform well with 
future data, unless the underlying model is appropriate for that forecasting technique. 
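The comparison of techniques by mean square error is easy to automate. The sketch below uses the winter and spring observations and the forecasts quoted in this section; the function name `mean_square_error` and the dictionary of techniques are illustrative choices, not notation from the text.

```python
def mean_square_error(actuals, forecasts):
    """Average of the squared forecast errors E_t = x_t - F_t."""
    errors = [x - f for x, f in zip(actuals, forecasts)]
    return sum(e * e for e in errors) / len(errors)


actual = [2925, 3040]  # observed winter and spring sales

# Forecasts for winter and spring from each technique, as quoted in the text.
technique_forecasts = {
    "last value": [2800, 2925],
    "linear trend": [2945, 3042],
    "seasonal": [2931, 3025],
}

mse = {name: mean_square_error(actual, f) for name, f in technique_forecasts.items()}
best = min(mse, key=mse.get)  # technique with the smallest MSE

for name, value in mse.items():
    print(name, value)
print("smallest MSE:", best)
```

With only two periods the ranking is fragile, which is the caution raised in the paragraph above.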

Note that the simple bicycle example illustrates a useful concept. Before choosing
any forecasting technique, it is worthwhile ‘‘trying it out’’ on some past data.
This will reveal whether or not a forecasting technique is reasonable for the past data
and may give some indication of how well the forecasting technique will perform in the future.


19.8 The Box-Jenkins Method 


It has already been pointed out that procedures and models frequently are not coor- 
dinated in practice. The beauty of the Box-Jenkins method, a sophisticated and com- 
plex technique that requires a great amount of past data (a minimum of 50 time 
periods), is that the model and the procedure are coordinated. There is a systematic 
approach to identifying an appropriate model, chosen from a rich class of models. 
The historical data are used to test the validity of the model. The model also generates 
an appropriate forecasting procedure. 

The Box-Jenkins method is iterative in nature. First, a model is chosen. To 
choose this model, we must compute autocorrelations and partial autocorrelations and 
examine their patterns. An autocorrelation measures the correlation between time 
series values separated by a fixed number of periods. This fixed number of periods is 
called the lag. Therefore, the autocorrelation for a lag of two periods measures the 
correlation between every other observation; i.e., it is the correlation between the 
original time series and the same series moved forward two periods. The partial 
autocorrelation is a conditional autocorrelation between the original time series and 
the same series moved forward a fixed number of periods, holding the effect of the 
other lagged times fixed. We can compute both the autocorrelations and the partial 
autocorrelations for all lags; we can do it easily with a computer. From the autocor- 
relations and the partial autocorrelations, we can identify the form of one or more 
possible models because a rich class of models is characterized by these parameters. 
Actually, we compute the sample autocorrelations and the sample partial autocorrelations,
but these computations are ‘‘good’’ estimates, because we assume large
amounts of data. Now that we have identified the model (i.e., we have identified the
functional form), we must estimate the parameters associated with the model. This
estimate is made using the historical data. The functional form then becomes known
(or approximately known), and we can compute the residuals and examine their behavior.
Similarly, we can examine the behavior of the estimated parameters. If both
the ‘‘residuals’’ and the ‘‘estimated parameters’’ behave as expected under the presumed
model, the model appears to be validated. If they do not, then the model should
be modified and the procedure repeated until a model is validated. At this point, we
can obtain a forecast.
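To make the identification step concrete, the following sketch computes sample autocorrelations and sample partial autocorrelations. The text does not specify particular estimators, so two common choices are assumed here: the usual sample autocorrelation, which divides by the lag-zero sum of squares, and the Durbin-Levinson recursion for the partial autocorrelations. The function names are hypothetical.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_0, r_1, ..., r_max_lag (r_0 is always 1)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


def sample_pacf(acf):
    """Partial autocorrelations for lags 1..len(acf)-1 via Durbin-Levinson.

    acf must start with the lag-zero value 1.
    """
    pacf = []
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, len(acf)):
        num = acf[k] - sum(prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Autocorrelations that decay as 0.5**k produce a single spike at lag 1
# in the partial autocorrelations.
theoretical_acf = [0.5 ** k for k in range(5)]
print(sample_pacf(theoretical_acf))
print(sample_acf([1, 2, 3, 4, 5], max_lag=1))
```

In practice one would pass a long observed series to `sample_acf` and then feed its output to `sample_pacf`, looking for patterns such as exponential decay in one plot and a small number of spikes in the other.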

For example, suppose that the sample autocorrelations and the sample partial 
autocorrelations are calculated (usually by computer) as shown in Fig. 19.3. The 
sample autocorrelations appear to decrease exponentially as a function of the time 
lags, while the sample partial autocorrelations appear to have spikes at the first and 
second time lags of the observations followed by values that seem to be of negligible 
magnitude. This behavior of the sample autocorrelations and the sample partial au- 
tocorrelations is characteristic of the functional form, 


$$X_t = B_0 + B_1 X_{t-1} + B_2 X_{t-2} + e_t.$$

Assuming this functional form, we use the time series data to estimate $B_0$, $B_1$, and
$B_2$ (usually by means of a computer program). Denote these estimates by $b_0$, $b_1$, and
$b_2$, respectively. Using these estimates, together with the time series data, we obtain
the residuals

$$x_t - (b_0 + b_1 x_{t-1} + b_2 x_{t-2}).$$

Figure 19.3 Plot of sample autocorrelation and partial autocorrelation versus time lags.

If the assumed functional form is adequate, the residuals and the estimated parameters
should behave in a predictable manner. In particular, the sample residuals should
behave as approximately independent, normally distributed random variables, each
having mean zero and variance $\sigma^2$ (assuming that $e_t$, the random error at time period
$t$, has mean zero and variance $\sigma^2$). The estimated parameters should be uncorrelated
and significantly different from zero. Statistical tests are available for this diagnostic
checking.
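As an illustration of the estimation and residual steps for this functional form, the sketch below fits $b_0$, $b_1$, and $b_2$ by ordinary least squares and returns the residuals. Least squares is assumed here for simplicity; Box-Jenkins software may use other estimators. The helper names are hypothetical, and the demonstration series is synthetic and noise-free so that the recovered coefficients can be checked against the values used to generate it.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_ar2(series):
    """Least-squares estimates (b0, b1, b2) and residuals for the two-lag model."""
    rows = [(1.0, series[t - 1], series[t - 2]) for t in range(2, len(series))]
    targets = [series[t] for t in range(2, len(series))]
    # Normal equations (X'X) b = X'y for the three coefficients.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
    b0, b1, b2 = solve_linear_system(xtx, xty)
    residuals = [y - (b0 + b1 * r[1] + b2 * r[2]) for r, y in zip(rows, targets)]
    return (b0, b1, b2), residuals


# Synthetic series generated with B0 = 2, B1 = 0.5, B2 = -0.3 and no random error.
series = [10.0, 4.0]
for _ in range(10):
    series.append(2.0 + 0.5 * series[-1] - 0.3 * series[-2])

(b0, b1, b2), residuals = fit_ar2(series)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

With real data the residuals would not vanish; they would then be examined for independence, zero mean, and constant variance as described above.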

The Box-Jenkins procedure appears to be a complex one, and it is. Fortunately, 
computer software is available for the procedure. The programs calculate the sample 
autocorrelations and the sample partial autocorrelations necessary for identifying the 
form of the model. They also estimate the parameters of the model and do the diag- 
nostic checking. These programs, however, cannot accurately identify one or more 
models that are compatible with the autocorrelations and the partial autocorrelations. 
This expertise can be acquired, but it is beyond the scope of this text. Although the 
Box-Jenkins method is complicated, the resultant forecasts are extremely accurate, 
and, when the time horizon is short, better than most other forecasting techniques. 
Further, the procedure produces a measure of the forecast error. 


19.9 Linear Regression 


Statistical problems often are concerned with data where there exists a relationship 
between two variables. This section highlights the results when the relationship is 
linear. For example, suppose that a publisher of textbooks is concerned about the 
initial press run for her books. She sells books both through bookstores and through 
mail orders. This latter method uses an extensive advertising campaign through pub- 
lishing media and direct mail. The advertising campaign is conducted prior to the 
publication of the book. The sales manager has noted that there is a rather interesting 
linear relationship between the number of mail orders and the number sold through 
bookstores during the first year. He suggests that this relationship be exploited to 
determine the initial press run for subsequent books. 

Thus, if the number of mail order sales for a book is denoted by $X$, and the
number of bookstore sales by $Y$, then the random variables $X$ and $Y$ exhibit a degree
of association. There is no functional relationship between these two random variables;
i.e., given the number of mail order sales, one does not expect to determine
exactly the number of bookstore sales. Note that for any given number of mail order
sales, there is a range of possible bookstore sales, and vice versa. This variation may
be partially due to measurement errors (incorrect counts, for example), but it can be
attributed primarily to individual variation among published books. Thus no unique
functional relationship between mail order sales and bookstore sales can be expected.
However, it is anticipated that bookstore sales, for a given observed number of mail
order sales, increase as mail order sales increase. If sales increase this way, what then
is meant by the statement, ‘‘The sales manager has noted that there is a rather interesting
linear relationship between the number of mail orders and the number sold
through bookstores during the first year’’? Such a statement implies that the expected
value of the number of bookstore sales is linear with respect to the number of mail
order sales; i.e.,


E[Y|X = x] = A + Bx. 


Thus, if the number of mail order sales is x for many different books, the average 
number of corresponding bookstore sales would tend to be approximately A + Bx. 

Other examples of this degree of association model can easily be found. An 
educator may be interested in the relationship between a student’s performance on the 
college entrance examination and his subsequent performance in college. An engineer 
may be interested in the relationship between tensile strength and hardness of a ma- 
terial. An economist may wish to predict a measure of inflation as a function of the 
cost of living index, and so on. 

The degree of association model is not the only model of interest. In some cases, 
there exists a functional relationship between two variables that may be linked lin- 
early. In a forecasting context, one of the two variables is time, while the other is the 
variable of interest. In Sec. 19.5 such an example was mentioned in the context of 
the generating process of the time series being represented by a linear trend super- 
imposed with random fluctuations, i.e., 


$$X_t = A + Bt + e_t,$$

where $A$ is a constant, $B$ is the slope, and $e_t$ is the random error, assumed to have
expected value equal to zero and constant variance. (The symbol $X_t$ can also be read
as $X$ given $t$ or as $X|t$.) It follows that

$$E(X_t) = A + Bt.$$

Note that both the degree of association model and the exact functional relationship
model lead to the same linear regression, and their subsequent treatment is
almost identical. Hence the publishing example will be explored further to illustrate
how to treat both kinds of models, although the special structure of the model,

$$E(X_t) = A + Bt,$$

with $t$ taking on integer values starting with 1, leads to certain simplified expressions.
In regression analysis, standard notation uses $X$ to represent the independent variable
and $Y$ to represent the dependent variable of interest. Consequently, the notational
expression for this special time series model now becomes

$$Y_t = A + Bt + e_t.$$

Suppose that bookstore sales and mail order sales are given for 15 books. These data
appear in Table 19.2, and the resulting plot is given in Fig. 19.4.
It is evident that the points in Fig. 19.4 do not lie on a straight line. Hence it
is not clear where the line should be drawn to show the linear relationship. Suppose
that an arbitrary line, given by the expression $\hat{y} = a + bx$, is drawn through the
data. A measure of how well this line fits the data can be obtained by computing the
sum of squares of the vertical deviations of the actual points from the fitted line. Thus
let $y_i$ represent the bookstore sales of the $i$th book and $x_i$ the corresponding mail order
sales. Denote by $\hat{y}_i$ the point on the fitted line corresponding to the mail order sales
of $x_i$. The proposed measure of fit is then given by


$$Q = (y_1 - \hat{y}_1)^2 + (y_2 - \hat{y}_2)^2 + \cdots + (y_{15} - \hat{y}_{15})^2 = \sum_{i=1}^{15}(y_i - \hat{y}_i)^2.$$

The usual method for identifying the ‘‘best’’ fitted line is the method of least squares.
This method chooses that line, $a + bx$, that makes $Q$ a minimum. Thus $a$ and $b$ are
obtained simply by setting the partial derivatives of $Q$ with respect to $a$ and $b$ equal
to zero, and solving the resultant equations. This method yields the solution,

$$b = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

and

$$a = \bar{y} - b\bar{x},$$

where

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \quad \text{and} \quad \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}.$$

Table 19.2 Data for Mail Order
and Bookstore Sales Example

Mail Order Sales | Bookstore Sales
      1,310      |      4,360
      1,313      |      4,590
      1,320      |      4,520
      1,322      |      4,770
      1,338      |      4,760
      1,340      |      5,070
      1,347      |      5,230
      1,355      |      5,080
      1,360      |      5,550
      1,364      |      5,390
      1,373      |      5,670
      1,376      |      5,490
      1,384      |      5,810
      1,395      |      6,060
      1,400      |      5,940

Figure 19.4 Plot of mail order sales versus bookstore sales.

For the publishing example,

$$\bar{x} = 1,353.1,$$

$$\bar{y} = 5,219.3,$$

$$\sum_{i=1}^{15}(x_i - \bar{x})(y_i - \bar{y}) = 214,543.9,$$

$$\sum_{i=1}^{15}(x_i - \bar{x})^2 = 11,966,$$

and $a = -19,041.9$, $b = 17.930$. Hence the least-squares estimate of the bookstore
sales $\hat{y}$, when the mail order sales is $x$, is given by

$$\hat{y} = -19,041.9 + 17.930x,$$

and this line is drawn in Fig. 19.4.
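The least-squares calculation for the publishing example can be reproduced directly from the data in Table 19.2. The sketch below is illustrative; the function name `least_squares_line` is an assumption made here. Small differences from the figures in the text can arise because the text rounds the means before forming the sums.

```python
# Data from Table 19.2.
mail_orders = [1310, 1313, 1320, 1322, 1338, 1340, 1347, 1355,
               1360, 1364, 1373, 1376, 1384, 1395, 1400]
bookstore = [4360, 4590, 4520, 4770, 4760, 5070, 5230, 5080,
             5550, 5390, 5670, 5490, 5810, 6060, 5940]


def least_squares_line(xs, ys):
    """Return (a, b) for the fitted line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept
    return a, b


a, b = least_squares_line(mail_orders, bookstore)
forecast_1400 = a + b * 1400  # point forecast for mail order sales of 1,400
print(round(a, 1), round(b, 3), round(forecast_1400))
```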


This fitted line is useful for forecasting purposes. For a given value of x, the 
corresponding value of y represents the forecast. However, the decision maker may 
be interested in some measure of uncertainty that is associated with this forecast. This 
measure is easily obtained provided that certain assumptions can be made. Therefore, 
for the remainder of this section, it is assumed that: 


1. A random sample of $n$ pairs $(x_1, Y_1), (x_2, Y_2), \ldots, (x_n, Y_n)$ is to be taken.
2. The $Y_i$ are normally distributed with mean $A + Bx_i$ and variance $\sigma^2$
(independent of $i$).

The assumption that $Y_i$ is normally distributed is not a critical assumption in
determining the uncertainty in the forecast, but the assumption of constant variance
is crucial. Furthermore, an estimate of this variance is required.

An unbiased estimate of $\sigma^2$ is given by $s_{y|x}^2$, where

$$s_{y|x}^2 = \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n - 2}.$$


Confidence Interval Estimation of $E(Y | x = x^*)$


A very important reason for obtaining the linear relationship between two variables
is to use the line for future decision making. From the regression line, it is possible
to estimate $E(Y|x)$ by a point estimate (the forecast) and a confidence interval estimate
(a measure of forecast uncertainty). For example, the publisher might want to use this
approach to estimate the expected number of bookstore sales corresponding to mail
order sales of, say, 1,400, by both a point estimate and a confidence interval estimate.
She may be interested in this approach for forecasting purposes. A point estimate
corresponding to $x = x^*$ is given by

$$\hat{y}^* = a + bx^*.$$

The endpoints of a $(100)(1 - \alpha)$ percent confidence interval are given by

$$a + bx^* - t_{\alpha/2;\,n-2}\, s_{y|x}\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$$

and

$$a + bx^* + t_{\alpha/2;\,n-2}\, s_{y|x}\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}},$$

where $s_{y|x}^2$ is the estimate of $\sigma^2$, and $t_{\alpha/2;\,n-2}$ is the $100\alpha/2$ percentage point of the $t$
distribution with $n - 2$ degrees of freedom (see Table A5.2 of Appendix 5). It should
be noted that the interval is narrowest where $x^* = \bar{x}$, and it becomes wider as $x^*$
departs from the mean.

In the publishing example, $s_{y|x}^2$ is computed from the data in Table 19.2 to be
17,030. If a 95 percent confidence interval is required, Table A5.2 gives $t_{0.025;13} =
2.160$. The results derived in the preceding section yield 6,060 as the point estimate
of $E(Y|1,400)$, i.e., the forecast. Hence, the lower confidence limit corresponding
to mail order sales of 1,400 is 5,918, and the upper confidence limit is 6,202. The
fact that the confidence interval was obtained at a data point ($x = 1,400$) is purely
coincidental.

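The variance estimate and the confidence limits for the publishing example can be recomputed from Table 19.2 as a check. The sketch below is a hedged illustration: the function names are hypothetical, the $t$ value 2.160 is simply copied from the text rather than computed, and the results agree with the quoted limits only to within rounding.

```python
import math

# Data from Table 19.2.
mail_orders = [1310, 1313, 1320, 1322, 1338, 1340, 1347, 1355,
               1360, 1364, 1373, 1376, 1384, 1395, 1400]
bookstore = [4360, 4590, 4520, 4770, 4760, 5070, 5230, 5080,
             5550, 5390, 5670, 5490, 5810, 6060, 5940]


def regression_summary(xs, ys):
    """Fit the least-squares line and estimate the variance about it."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s2 = sse / (n - 2)  # unbiased estimate of sigma^2
    return {"n": n, "x_bar": x_bar, "sxx": sxx, "a": a, "b": b, "s2": s2}


def confidence_interval(x_star, fit, t_value):
    """Confidence limits for E(Y | x = x_star)."""
    center = fit["a"] + fit["b"] * x_star
    spread = math.sqrt(1 / fit["n"] + (x_star - fit["x_bar"]) ** 2 / fit["sxx"])
    half_width = t_value * math.sqrt(fit["s2"]) * spread
    return center - half_width, center + half_width


fit = regression_summary(mail_orders, bookstore)
lower, upper = confidence_interval(1400, fit, t_value=2.160)
print(round(fit["s2"]), round(lower), round(upper))
```

Evaluating `confidence_interval` at several values of `x_star` shows the interval widening as `x_star` moves away from the mean of the mail order sales.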
Predictions


The confidence interval statement for the expected number of bookstore sales corre- 
sponding to mail order sales of 1,400 may be useful for budgeting purposes, but it is 
not too useful for making decisions about the actual press run. Instead of obtaining 
bounds on the expected number of bookstore sales, this kind of decision requires 
bounds on what the actual bookstore sales will be, i.e., a prediction interval on the 
value that the random variable (bookstore sales) takes on. This measure is a different 
measure of forecast uncertainty. The two endpoints of such an interval are given by
the expressions

$$a + bx_+ - t_{\alpha/2;\,n-2}\, s_{y|x}\sqrt{1 + \frac{1}{n} + \frac{(x_+ - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$$

and

$$a + bx_+ + t_{\alpha/2;\,n-2}\, s_{y|x}\sqrt{1 + \frac{1}{n} + \frac{(x_+ - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}.$$

For a given value $x_+$, the probability is $1 - \alpha$ that the value of the future $Y_+$ associated
with the $x_+$ will fall in this interval. Thus, if $x_+$ is 1,400, then the corresponding 95
percent prediction interval for the number of bookstore sales is given by 6,060 ±
316, which is naturally wider than the confidence interval for the expected number of
bookstore sales.
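The prediction interval differs from the confidence interval only by the extra 1 under the square root, as the following sketch shows. It uses the summary figures quoted in the text for the publishing example rather than the raw data; the constant and function names are illustrative assumptions, and the half-width agrees with the quoted 316 only to within rounding.

```python
import math

# Summary values quoted in the text for the publishing example.
N = 15            # number of books
X_BAR = 1353.1    # mean mail order sales
SXX = 11966.0     # sum of squared deviations of mail order sales
A, B = -19041.9, 17.930
S2 = 17030.0      # estimate of the variance about the line
T_VALUE = 2.160   # t value for 95 percent with 13 degrees of freedom


def prediction_interval(x_plus):
    """Return (center, half_width) of the prediction interval at x_plus."""
    center = A + B * x_plus
    spread = math.sqrt(1 + 1 / N + (x_plus - X_BAR) ** 2 / SXX)
    half_width = T_VALUE * math.sqrt(S2) * spread
    return center, half_width


center, half_width = prediction_interval(1400)
print(round(center), round(half_width))
```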

Whereas the publisher can find an interval that will contain bookstore sales
corresponding to particular mail order sales with probability $1 - \alpha$, she is unable to
use this type of result over and over and still maintain a measure for making correct
statements. The reason is that these statements would all be based upon the same
statistical data, so that the statements would not be statistically independent. If the
statements are independent, and if $k$ future bookstore sales are to be predicted, with
each statement being made with probability $1 - \alpha$, then the probability is $(1 - \alpha)^k$
that all $k$ predictions of future bookstore sales are correct. However, if $k$ is large or
possibly unknown, even this technique based upon the (incorrect) assumption of independence
would be useless. A solution to this problem can be obtained by using
simultaneous tolerance intervals. Using this technique, the publisher can take the
mail order sales of any book, find an interval (based on the previously determined
fitted line) that will contain the actual bookstore sales with probability at least $1 - \alpha$,
and repeat this for any number of books having the same or different mail order sales.
Furthermore, the probability is $P$ that all of these predictions are correct. An alternative
interpretation is as follows. If every publisher followed this procedure, each using his
or her own fitted line, then $100P$ percent of the publishers (on the average) would
find that at least $100(1 - \alpha)$ percent of their bookstore sales would fall into the
predicted intervals. This measure is the third measure of forecast uncertainty. The
expression for the endpoints of each such interval is given by

$$a + bx_+ - c^{**} s_{y|x}\sqrt{\frac{1}{n} + \frac{(x_+ - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$$

and

$$a + bx_+ + c^{**} s_{y|x}\sqrt{\frac{1}{n} + \frac{(x_+ - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}},$$

where $c^{**}$ is given in Table 19.3; $c^{**}$ is clearly a function of $n$, $P$, and $\alpha$.

Thus the publisher can state that the bookstore sales corresponding to known
mail order sales will fall in the interval constructed using the expressions just given.
Such statements can be made for as many books as the publisher desires. Furthermore,
the probability is $P$ that at least $100(1 - \alpha)$ percent of bookstore sales corresponding
to mail order sales will fall in these intervals. If $P$ is chosen as 0.90 and $\alpha = 0.05$,
the appropriate value of $c^{**}$ is 11.625. Hence the number of bookstore sales corresponding
to mail order sales of 1,400 books will fall in the interval 6,060 ± 759. If
another book had mail order sales of 1,353, the bookstore sales would fall in the
interval 5,258 ± 390, and so on. At least 95 percent of the bookstore sales will fall
into their predicted intervals, and these statements are made with confidence 0.90.
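The tolerance interval at mail order sales of 1,400 can be sketched as follows. Table 19.3 lists only even values of $n$, so the value 11.625 for $n = 15$ is assumed here to come from linear interpolation between the rows for $n = 14$ and $n = 16$; that interpolation rule, and the names used in the code, are assumptions of this sketch rather than statements from the text. The other inputs are the summary figures quoted for the publishing example.

```python
import math

# Summary values quoted in the text for the publishing example.
N = 15
X_BAR = 1353.1
SXX = 11966.0
A, B = -19041.9, 17.930
S2 = 17030.0

# Rows of Table 19.3 for P = 0.90 and alpha = 0.05 that bracket n = 15.
C_ROWS = {14: 11.447, 16: 11.803}


def interpolate_c(n, rows):
    """Linear interpolation between the nearest tabulated n (an assumption)."""
    lo = max(k for k in rows if k <= n)
    hi = min(k for k in rows if k >= n)
    if lo == hi:
        return rows[lo]
    weight = (n - lo) / (hi - lo)
    return rows[lo] + weight * (rows[hi] - rows[lo])


def tolerance_interval(x_plus, c_value):
    """Return (center, half_width) of the simultaneous tolerance interval."""
    center = A + B * x_plus
    spread = math.sqrt(1 / N + (x_plus - X_BAR) ** 2 / SXX)
    return center, c_value * math.sqrt(S2) * spread


c_value = interpolate_c(N, C_ROWS)
center, half_width = tolerance_interval(1400, c_value)
print(round(c_value, 3), round(center), round(half_width))
```

The same `c_value` is reused for every book, which is what allows the statements to be made simultaneously for any number of books.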


Table 19.3 Values of $c^{**}$
n $\alpha = 0.50$ $\alpha = 0.25$ $\alpha = 0.10$ $\alpha = 0.05$ $\alpha = 0.01$ $\alpha = 0.001$





P = 0.90 
4 7.471 10.160 13.069 14.953 18.663 23.003 
6 5.380 7.453 9.698 11.150 14.014 17.363 
8 5.037 7.082 9.292 10.722 13.543 16.837 
10 4.983 7.093 9.366 10.836 13.733 17.118 
12 5.023 7.221 9.586 11.112 14.121 17.634 
14 5.101 7.394 9.857 11.447 14.577 18.232 
16 5.197 7.586 10.150 11.803 15.057 18.856 
18 5.300 7.786 10.449 12.165 15.542 19.484 
20 5.408 7.987 10.747 12.526 16.023 20.104 

P = 0.95 
4 10.756 14.597 18.751 21.445 26.760 32.982 
6 6.652 9.166 11.899 13.669 17.167 21.266 
8 5.933 8.281 10.831 12.484 15.750 19.568 
10 5.728 8.080 10.632 12.286 15.553 19.369 
12 5.684 8.093 10.701 12.391 15.724 19.619 
14 5.711 8.194 10.880 12.617 16.045 20.050 
16 5.771 8.337 11.107 12.898 16.431 20.559 
18 5.848 8.499 11.357 13.204 16.845 21.097 
20 5.937 8.672 11.619 13.521 17.272 21.652 

P = 0.99 
4 24.466 33.019 42.398 48.620 60.500 74.642 
6 10.444 14.285 18.483 21.215 26.606 32.920 
8 8.290 11.453 14.918 17.166 21.652 26.860 
10 7.567 10.539 13.796 15.911 20.097 24.997 
12 7.258 10.182 13.383 15.479 19.579 24.403 
14 7.127 10.063 13.267 15.355 19.485 24.316 
16 7.079 10.055 13.306 15.410 19.582 24.467 
18 7.074 10.111 13.404 15.552 19.794 24.746 
20 7.108 10.198 13.566 15.745 20.065 25.122 





Source: Reprinted by permission from Lieberman, G. J., and R. G. Miller: ‘‘Simultaneous
Tolerance Intervals in Regression,’’ Biometrika, 50(1 and 2):164, 1963.

19.10 Conclusions


It is important to consider the entire forecasting system carefully. The need to obtain 
a forecast has to be identified at the appropriate management level. The historical data 
required must be compiled. By studying these data, an appropriate model can be 
structured. A forecasting procedure that behaves well under the model should be 
selected. The forecasting procedure may require choosing one or more parameters
(e.g., the smoothing constant $\alpha$ in exponential smoothing), and the historical data may
prove useful in making this choice. Finally, the forecasting process should be viewed 
as dynamic, with the accumulated new data compared with their associated forecasts. 
These data may also be used to update the parameters and the form of the underlying 
model, as well as the parameters of the forecasting procedure itself. 


SELECTED REFERENCES


1. Box, G. E. P., and G. M. Jenkins: Time Series Analysis, Forecasting and Control,
Holden-Day, San Francisco, 1976.
2. Brown, R. G.: Statistical Forecasting for Inventory Control, McGraw-Hill, New York,
1959.
3. Brown, R. G.: Smoothing, Forecasting, and Prediction of Discrete Time Series,
Prentice-Hall, Englewood Cliffs, N.J., 1972.
4. Gardner, E. S., Jr.: ‘‘Exponential Smoothing: The State of the Art,’’ Journal of
Forecasting, 4:1-38, 1985.
5. Gilchrist, W. G.: Statistical Forecasting, Wiley, New York, 1976.
6. Hax, A. C., and D. Candea: Production and Inventory Management, Prentice-Hall,
Englewood Cliffs, N.J., 1984.
7. Hoff, J. C.: A Practical Guide to Box-Jenkins Forecasting, Lifetime Learning Publications,
Belmont, Calif., 1983.
8. Johnson, L. A., and D. C. Montgomery: Operations Research in Production Planning,
Scheduling, and Inventory Control, Wiley, New York, 1974.
9. Levin, R. I., D. S. Rubin, and J. P. Stinson: Quantitative Approaches to Management,
6th ed., McGraw-Hill, New York, 1986.
10. Montgomery, D. C., and L. A. Johnson: Forecasting and Time Series Analysis,
McGraw-Hill, New York, 1976.


PROBLEMS 


1.* Suppose that the previous forecast was 2,083, the actual value of the variable of 
interest for the last period was 1,975, and the oldest value of the variable of interest was 1,945. 
Using the moving average technique based upon the most recent four observations, what is the 
new forecast for the next period? 


2. Suppose that the previous forecast was 2,083, the actual value of the variable of 
interest for the last period was 1,975, and a = 0.3. Using exponential smoothing, what is the 
new forecast for the next period? 


3.* Suppose that the previous forecast was 782, the actual value of the variable of 
interest for the last period was 794, and a = 0.1. Using exponential smoothing, what is the 
new forecast for the next period? 
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4. Suppose that the previous forecast was 782, the actual value of the variable of interest 
for the last period was 794, and the oldest value of the variable of interest was 805. Using the 
moving average technique based upon the most recent observations, what is the new forecast 
for the next period? 


5. You are the new person at a statistical forecasting service, and you have been asked 
to update a moving average forecast based upon the most recent 10 observations. You know 
that the previous forecast was 1,551, the actual value of the variable of interest for the latest 
period (i.¢., zero periods ago) is 1,532, and the oldest value (i.e., 10 periods ago) of the 
variable of interest was 1,632. What is the new forecast for the next period? 


6. If a is set equal to zero or 1 in the exponential smoothing expression, what happens 
to the forecast? 


7. Use the bicycle sales example presented in the section on exponential smoothing 
adjusted for trend, and solve the problem in this example using simple exponential smoothing 
with $\alpha = 0.3$. 


8. A company uses exponential smoothing with $\alpha = \frac{1}{2}$ to forecast demand for a product. 
For each month, the company keeps a record of the forecasted demand (made at the end of the 
previous month) and also the actual demand. Some of the records have been lost; the remaining 
data appear in the table below. 


January February March April May June 


400 380 390 380 
400 360 — n= 





Forecast 
Actual 





(a) Using only data in the table for March, April, May, and June, determine the actual 
demands in April and May. 

(b) Suppose now that a clerical error is discovered; the actual demand in January was 
432, not 400 as shown in the table. Using only the actual demands going back to 
January (even though the February actual demand is unknown), give the corrected 
forecast for June. 


9. Use the bicycle sales example presented in the section on exponential smoothing 
adjusted for trend, and solve the problem in this example using $\alpha = \beta = 0.3$. 


10.* The U.S. unemployment rate time series shown in Fig. 19.1 can be represented 
in tabular form as follows: 


Unemployment Rate (%), by month and year 

Month     1980     1981     1982     1983     1984 
   1       6.9      8.2      9.4     11.4      8.8 
   4       6.7      7.0      9.2     10.0      7.6 
   7       7.9      7.3      9.8      9.4      7.5 
  10       7.1      7.5      9.9      8.4       - 


(a) Starting with a forecast for 10/80, use the moving average technique based upon 
the past three periods to forecast each unemployment rate through 10/84. 
(b) Examine the residuals for 10/80 through 7/84, and compute the mean square error, 
i.e., 

$$\frac{\sum E_i^2}{16}.$$





11. Use the unemployment data of Prob. 10. 
(a) Starting with a forecast for 10/80, use the exponential smoothing technique with an 
initial estimate of 7.2 percent and $\alpha = 0.1$, i.e., 


Forecast for 10/80 = (0.1)7.9 + (0.9)7.2, 


to forecast each unemployment rate through 10/84. 
(b) Examine the residuals for 10/80 through 7/84, and compute the mean square error, 
i.e., 

$$\frac{\sum E_i^2}{16}.$$





12. (a) Solve Prob. 11 using $\alpha = 0.3$. 
(b) Which $\alpha$ ($\alpha = 0.1$ or 0.3) would you use to forecast the unemployment rate 
for 10/84? 


13. Based upon the results of Probs. 10(b) and 11(b), which forecasting procedure would 
you choose, moving average or exponential smoothing? 


14. Use the unemployment data of Prob. 10. 

(a) Starting with a one-step forecast for 10/80, use the exponential smoothing technique 
adjusted for trend, with an initial estimate of 7.2 percent for unemployment and 
with an initial estimate of trend of 0.2 percent, to forecast each unemployment rate 
through 10/84. Use $\alpha = \beta = 0.1$. 

(b) Examine the residuals for 10/80 through 7/84, and compute the mean square error, 
i.e., 

$$\frac{\sum E_i^2}{16}.$$


15. Based upon the results of Probs. 10(b), 11(b), and 14(b), which forecasting pro- 
cedure would you choose, moving average, exponential smoothing, or exponential smoothing 
adjusted for trend? 





16. In order to plan for a suitable labor force, you need to know the demand for a 
particular product. The following table presents demand over the past 11 quarters. 


[Demand || uaner | Demana || Oraner | Demana | 
; 
7 
8 





736 
a fe 724 
665 il 813 
630 12 — 


(a) Starting with a forecast for quarter 5, use the moving average technique based upon 
the past 4 quarters to forecast each demand through quarter 12. 
(b) Examine the residuals for quarter 5 through quarter 12, and compute the mean square 
error, i.e., 

$$\frac{\sum E_i^2}{7}.$$





17. Use the quarterly demand data of Prob. 16. 
(a) Starting with a forecast for quarter 3, use the exponential smoothing technique with 
an initial estimate of 546 and $\alpha = 0.1$, i.e., 


Forecast for quarter 3 = (0.1)528 + (0.9)546, 


to forecast each demand through quarter 12. 
(b) Examine the residuals for quarter 3 through quarter 11, and compute the mean square 
error, i.e., 

$$\frac{\sum E_i^2}{9}.$$





18. (a) Solve Prob. 17 using $\alpha = 0.3$. 
(b) Which $\alpha$ ($\alpha = 0.1$ or 0.3) would you use to forecast the demand for quarter 
12? 


19. Based upon the results of Probs. 16(b) and 17(b), which forecasting procedure would 
you choose, moving average or exponential smoothing? 


20. Use the quarterly demand data of Prob. 16. 

(a) Starting with a one-step forecast for quarter 3, use the exponential smoothing 
technique adjusted for trend, with an initial estimate of 546 for demand and an 
initial estimate of trend of 18, to forecast each demand through quarter 12. Use 
$\alpha = \beta = 0.1$. 

(b) Examine the residuals for quarter 3 through quarter 11, and compute the mean square 
error, i.e., 

$$\frac{\sum E_i^2}{9}.$$


21. Based upon the results of Probs. 16(b), 17(b), and 20(b), which forecasting pro- 
cedure would you choose, moving average, exponential smoothing, or exponential smoothing 
adjusted for trend? 





22. Use the unemployment data of Prob. 10. 

(a) Assume a seasonal effects model (four periods per year) and use the data for 1980 
to estimate the initial seasonal factors. Use exponential smoothing with seasonal 
effects to forecast the unemployment for each period of the year 1981. Use the final 
forecast (10/81) to forecast the unemployment rate for 10/84. Let $\alpha = \gamma = 0.1$. 

(b) Examine the residuals for the four periods of 1981, and compute the mean square 
error, i.e., 

$$\frac{\sum E_i^2}{4}.$$





23. Solve Prob. 22 using 1982 data to estimate the initial seasonal factors, and forecast 
the unemployment rate for each period of the year 1983. Use the final forecast (10/83) to 
forecast the unemployment rate for 10/84. 


24.* Suppose that a time series behaves as follows: 
(a) Using the method of least squares, estimate the line $A + Bt$. 

(b) From the line found in (a), forecast Y,,. 

(c) Starting with a forecast in period 3, use the exponential smoothing technique with 
an initial estimate of 430 and $\alpha = 0.1$, i.e., 


Forecast for period 3 = (0.1)446 + (0.9)430, 


to forecast each demand through period 11. 

(d) Starting with a one-step forecast for period 3, use the exponential smoothing tech- 
nique adjusted for trend, with an initial estimate of 430 for the variable of interest 
and an initial estimate of trend of 18, to forecast each demand through period 11. 
Use $\alpha = \beta = 0.2$. 


25. The following data relate road width x and accident frequency Y. Road width (in 
feet) was treated as the independent variable, and values of the random variable Y, in accidents 
per $10^8$ vehicle miles, were observed. 


Number of Observations = 7 

   x      y 
  26     92 
  30     85 
  44     78 
  50     81 
  62     54 
  68     51 
  74     40 

$$\sum_{i=1}^{7} x_i = 354, \qquad \sum_{i=1}^{7} y_i = 481,$$

$$\sum_{i=1}^{7} x_i^2 = 19{,}956, \qquad \sum_{i=1}^{7} y_i^2 = 35{,}451, \qquad \sum_{i=1}^{7} x_i y_i = 22{,}200.$$


Assume that Y is normally distributed with mean A + Bx and constant variance for all x, and 
that the sample is random. Interpolate if necessary. 

(a) Fit a least-squares line to the data, and forecast the accident frequency when the 
road width is 55 feet. 

(b) Construct a 95 percent prediction interval for $Y_+$, a future observation of Y, 
corresponding to $x_+ = 55$ feet. 

(c) Suppose that two future observations on Y, both corresponding to $x_+ = 55$ feet, 
are to be made. Construct prediction intervals for both of these observations so that 
the probability is at least 95 percent that both future values of Y will fall into them 
simultaneously. Hint: If k predictions are to be made [such as given in part (d)], 
each with probability $1 - \alpha$, then the probability is at least $1 - k\alpha$ that all k future 
observations will fall into their respective intervals. 

(d) Construct a simultaneous tolerance interval for the future value of Y corresponding 
to $x_+ = 55$ feet with P = 0.90 and $1 - \alpha = 0.95$. 


26. The following data are observations on a dependent random variable Y taken at 
various levels of an independent variable x. [It is to be assumed that $E(Y_i \mid x_i) = A + Bx_i$ 
and $Y_i$ are independent normal random variables with mean zero and variance $\sigma^2$.] Suppose the 
data are as follows. 








(a) Estimate the linear relationship by the method of least squares, and forecast the 
value of Y when x = 10. 

(b) Find a 95 percent confidence interval for the expected value of Y at $x^* = 10$. 

(c) Find a 95 percent prediction interval for a future observation to be taken at 
$x_+ = 10$. 

(d) For $x_+ = 10$, P = 0.90, and $(1 - \alpha) = 0.95$, find a simultaneous tolerance 
interval for the future value of $Y_+$. Interpolate if necessary. 


27. If a particle is dropped at time t = 0, physical theory indicates that the relationship 
between r, the distance traveled, and t, the time elapsed, is $r = gt^k$ for some positive constants 
g and k. A transformation to linearity can be obtained by taking logarithms: 


log r = log g + k log t. 


Letting y = log r, A = log g, and x = log t, this relation becomes y = A + kx. Due to 
random error in measurement, however, it can only be stated that $E(Y \mid x) = A + kx$. Assume 
Y is normally distributed with mean A + kx and variance $\sigma^2$. 

A physicist who wishes to estimate k and g performs the following experiment: At time 
0 the particle is dropped. At time t the distance r is measured. He performs this experiment 
five times, obtaining the following data: 


y = log r     x = log t 
  -3.95         -2.0 
  -2.12         -1.0 
   0.08          0.0 
   2.20         +1.0 
   3.87         +2.0 





(a) Obtain least-squares estimates for k and log g, and forecast the distance traveled 
when logt = +3.0. 

(b) Starting with a forecast for log r when log t = 0, use the exponential smoothing 
technique with an initial estimate of log r = -3.95 and $\alpha = 0.1$, i.e., 


Forecast of log r (when log t = 0) = (0.1)(-2.12) + (0.9)(-3.95), 


to forecast each log r for all integer log t through log t = +3.0. 

(c) Repeat part (b) using the exponential smoothing technique adjusted for trend and a 
one-step forecast. Use an initial estimate of trend equal to the slope found in part 
(a). Let $\beta = 0.1$. 


The following data have been calculated: 

$$\bar{y} = 0.016,$$
$$\sum (x_i - \bar{x})^2 = 10,$$
$$\sum (x_i - \bar{x})(y_i - \bar{y}) = 19.96,$$
$$\sum (y_i - \hat{y}_i)^2 = 0.0858,$$
$$\frac{s_{y|x}}{\sqrt{\sum (x_i - \bar{x})^2}} = 0.053,$$
$$s_{y|x}\sqrt{\tfrac{1}{5}} = 0.076.$$


Note: All logarithms are to the base 10. 


28. Suppose that the relation between Y and x is given by 

$$E(Y \mid x) = Bx,$$

where Y is assumed to be normally distributed with mean Bx and known variance $\sigma^2$. n independent 
pairs of observations are taken and are denoted by $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. Find the 
least-squares estimate of B. 
20.1 Introduction 


Section 15.2 introduced the concept of a dynamic system evolving over time. The 
behavior of such a system resulted in an analysis of a particular type of stochastic 
process. The ideas presented can be illustrated by considering the following mainte- 
nance-model example. A production process contains a machine that deteriorates rap- 
idly in both quality and output under heavy usage, so that it is inspected periodically, 
say, at the end of each day. Immediately after inspection, the condition of the machine 
is noted and classified into one of four possible states: 





State   Condition 

  0     Good as new 
  1     Operable: minor deterioration 
  2     Operable: major deterioration 
  3     Inoperable: output of unacceptable quality 


Let $X_t$ denote the observed state of the machine after inspection at the end of the tth 
day. It is reasonable to assume that the state of the system evolves according to some 
probabilistic "laws of motion," so that the sequence of states $\{X_t\}$ can be viewed as 
a stochastic process. Furthermore, it will be assumed that the stochastic process is a 
finite-state Markov chain (see Sec. 15.3), with known transition matrix given by 





State |   0      1      2      3 
  0   |   0     7/8    1/16   1/16 
  1   |   0     3/4    1/8    1/8 
  2   |   0      0     1/2    1/2 
  3   |   0      0      0      1 





From this transition matrix, it becomes evident that once the machine becomes in- 
operable (enters state 3), it remains inoperable. Therefore, the analysis of this sto- 
chastic process is probably uninteresting because state 3 is an absorbing state, and 
eventually the machine will enter this state and just remain there; i.e., after some time 
period, $X_t$ will always equal 3. Clearly, from a practical point of view, this model is 
intolerable because a machine that is inoperable cannot continue to remain in the 
production process and must be replaced (or repaired). This action of replacement 
alters the behavior of the system, so that the system now evolves over time according 
to the joint effect of the probabilistic laws of motion and the action of replacing an 
inoperable machine. Note that the action of replacing an inoperable machine can be 
thought of as defining a maintenance policy. 

When a machine becomes inoperable and is replaced, the replacement machine 
is as good as new; i.e., the machine is found to be in state 0 at the time of the regular 
inspection at the end of the next day. As a practical matter, the replacement process 
can be thought of as taking 1 day to complete so that production is lost for this period. 

The cost incurred while this system evolves contains several components. When 
the system is in state 0, 1, or 2, defective items may be produced during the next 
day, and the expected costs are given by 


State   Expected Cost Due to Producing Defective Items 
  0     0 
  1     $1,000 
  2     $3,000 





If the machine is replaced, a replacement cost of $4,000 is incurred, together with a 
cost of lost production (lost profit) of $2,000. Hence the total cost incurred whenever 
the system is in state 3 is $6,000. 

The stochastic process resulting from the system with the aforementioned main- 
tenance policy, i.e., replacing an inoperable machine, is still a finite-state Markov 
chain, but with the transition matrix now given by 


State |   0      1      2      3 
  0   |   0     7/8    1/16   1/16 
  1   |   0     3/4    1/8    1/8 
  2   |   0      0     1/2    1/2 
  3   |   1      0      0      0 





It may be interesting to evaluate the cost of this maintenance policy. If the (long- 
run) expected average cost per day or the (long-run) actual average cost per day is an 
appropriate measure, the results appearing in Sec. 15.7 (under the subsections on 
expected average cost per unit time or expected average cost per unit time for complex 
cost functions) are appropriate. 


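By noting that $p_{ij}^{(n)} > 0$ for all i and j when n = 3, it is evident that every state 
is positive recurrent and belongs to one class. The steady-state equations can be written as 

$$
\begin{aligned}
\pi_0 &= \pi_3, \\
\pi_1 &= \tfrac{7}{8}\pi_0 + \tfrac{3}{4}\pi_1, \\
\pi_2 &= \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1 + \tfrac{1}{2}\pi_2, \\
\pi_3 &= \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1 + \tfrac{1}{2}\pi_2, \\
1 &= \pi_0 + \pi_1 + \pi_2 + \pi_3.
\end{aligned}
$$


The simultaneous solution is 

$$\pi_0 = \tfrac{2}{13}, \qquad \pi_1 = \tfrac{7}{13}, \qquad \pi_2 = \tfrac{2}{13}, \qquad \pi_3 = \tfrac{2}{13}.$$


Hence the long-run expected average cost per day is given by 

$$0\pi_0 + 1{,}000\pi_1 + 3{,}000\pi_2 + 6{,}000\pi_3 = \frac{25{,}000}{13} = \$1{,}923.08,$$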


and this cost represents the cost of this maintenance policy. 
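
The steady-state calculation above can be checked numerically. What follows is a minimal sketch in Python, not part of the original text. It assumes the transition probabilities that appear in the steady-state equations (7/8, 1/16, 1/16 from state 0; 3/4, 1/8, 1/8 from state 1; 1/2, 1/2 from state 2; replacement from state 3), and the helper name `steady_state` is hypothetical. It solves the balance equations with exact fractions and then evaluates the expected average cost per day.

```python
from fractions import Fraction as F

def steady_state(P):
    """Solve pi = pi P with sum(pi) = 1 by Gauss-Jordan elimination (exact)."""
    n = len(P)
    # Balance equations for j = 0..n-2: sum_i pi_i * (P[i][j] - delta_ij) = 0.
    # The last balance equation is redundant, so the normalization replaces it.
    A = [[P[i][j] - (1 if i == j else 0) for i in range(n)] + [F(0)]
         for j in range(n - 1)]
    A.append([F(1)] * (n + 1))
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [A[r][n] for r in range(n)]

# Transition matrix under the policy "replace only when inoperable".
P = [
    [F(0), F(7, 8), F(1, 16), F(1, 16)],
    [F(0), F(3, 4), F(1, 8), F(1, 8)],
    [F(0), F(0), F(1, 2), F(1, 2)],
    [F(1), F(0), F(0), F(0)],  # state 3: machine replaced, back to state 0
]
cost = [0, 1000, 3000, 6000]   # expected cost per day in each state

pi = steady_state(P)
average_cost = sum(p * c for p, c in zip(pi, cost))
print(pi, float(average_cost))
```

Exact fractions are used here only so that the result can be compared directly with the fractions in the text; a floating-point linear solver would serve equally well for larger chains.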


20.2 Markovian Decision Models 




The previous section introduced an example of a maintenance model for a machine 
and presented a maintenance policy; i.e., when a machine becomes inoperable, it is 
replaced; otherwise, the machine is left alone. In other words, a decision is made to 
take the action replace the machine when it is found to be in state 3, whereas a 
decision is made to take the action leave the machine as is when it is found to be in 
state 0, 1, or 2. Even when these two actions are the only permissible ones, there are 
still other policies that can be generated; e.g., when the machine becomes inoperable 
or is found to be operable but with major deterioration (machine is in state 2 or 3), 
replace it; otherwise, leave the machine as is. Note that this policy generates a different 
transition matrix, i.e., 


State |   0      1      2      3 
  0   |   0     7/8    1/16   1/16 
  1   |   0     3/4    1/8    1/8 
  2   |   1      0      0      0 
  3   |   1      0      0      0 


To make the machine-maintenance example more realistic, suppose that a third 
action is permitted: overhaul. When a machine is overhauled, the machine is returned 
to state 1 (operable, minor deterioration) at the time of the regular inspection at the 
end of the next day. As a practical matter, the overhaul process, like replacement, 
can be thought of as requiring a day to complete, so that production is lost for this 
period. Furthermore, overhauling the machine costs $2,000 and will not be considered 
as a viable decision when the machine becomes inoperable. 

In viewing this dynamic system, it is evident that the system evolves over time 
according to the joint effect of the probabilistic laws of motion and the sequence of 
decisions made (actions taken). In particular, the machine is inspected at the end of 
each day and its state is recorded. A decision as to which action to take must be made, 
i.e., 


Decision   Action 

   1       Do nothing 
   2       Overhaul (return system to state 1) 
   3       Replace (return system to state 0) 





For the general model it will be assumed that a system is observed at time 
t = 0, 1, . . . , and classified into one of a finite number of states labeled 0, 1, . . . , M. 
Let $\{X_t, t = 0, 1, \ldots\}$ denote the sequence of observed states. After each observation, 
one of K (finite) possible decisions (actions), labeled 1, 2, . . . , K, is taken.¹ 
Let $\{A_t, t = 0, 1, \ldots\}$ denote the sequence of actual decisions made. 

A policy, denoted by R, is a rule for making decisions at each point in time. In 
principle, a policy could use all the previously observed information up to time t, that 
is, the entire history of the system consisting of $X_0, X_1, \ldots, X_t$ and 
$A_0, A_1, \ldots, A_{t-1}$. However, for most problems encountered in practice, it is 
sufficient to confine consideration to those policies that depend upon only the observed 
state of the system at time t, $X_t$, and the possible decisions available. Hence a policy 
R can be viewed as a rule that prescribes decision $d_i(R)$ when the system is in state 
i, i = 0, 1, . . . , M. Thus R is completely characterized by the values 

$$\{d_0(R), d_1(R), \ldots, d_M(R)\}.$$


¹ In general, the number of possible decisions may depend upon the state of the system. Such a case is 
considered in Sec. 20.7. 
Note that this description assumes that whenever the system is in state i, the decision 
to be made is the same for all values of t. Policies possessing this property are called 
stationary policies. 

In the example, the interesting policies are 


Policy   Verbal Description                           $d_0(R)$  $d_1(R)$  $d_2(R)$  $d_3(R)$ 

$R_a$    Replace in state 3                               1         1         1         3 
$R_b$    Replace in state 3, overhaul in state 2          1         1         2         3 
$R_c$    Replace in states 2, 3                           1         1         3         3 
$R_d$    Replace in states 1, 2, 3                        1         3         3         3 


Note that policy $R_a$ is the policy described in the previous section, and $R_c$ is the policy 
alluded to earlier in this section. Furthermore, recall that each policy results in a 
different transition matrix. 

It has been noted that a system evolves over time according to the joint effect 
of the probabilistic laws of motion and the sequence of decisions made; its path is 
dependent upon its initial state, $X_0$. It is assumed that whenever the system is in state 
i and decision $d_i(R) = k$ is made, the system moves to a new state j, with known 
transition probability $p_{ij}(k)$, for all i, j = 0, 1, . . . , M and k = 1, 2, . . . , K. Thus, 
if a given policy R is followed, the resultant stochastic process is a Markov chain 
with a known transition matrix (dependent upon the policy chosen). Unless otherwise 
noted, throughout this chapter it is assumed for technical reasons that the Markov 
chain associated with every transition matrix is irreducible. 

In the example, the following transition matrices are obtained: 




















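$R_a$                                       $R_b$ 
State |   0      1      2      3           State |   0      1      2      3 
  0   |   0     7/8    1/16   1/16           0   |   0     7/8    1/16   1/16 
  1   |   0     3/4    1/8    1/8            1   |   0     3/4    1/8    1/8 
  2   |   0      0     1/2    1/2            2   |   0      1      0      0 
  3   |   1      0      0      0             3   |   1      0      0      0 

$R_c$                                       $R_d$ 
State |   0      1      2      3           State |   0      1      2      3 
  0   |   0     7/8    1/16   1/16           0   |   0     7/8    1/16   1/16 
  1   |   0     3/4    1/8    1/8            1   |   1      0      0      0 
  2   |   1      0      0      0             2   |   1      0      0      0 
  3   |   1      0      0      0             3   |   1      0      0      0 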





To summarize, given a distribution $P\{X_0 = i\}$ over the initial states of the system 
and a policy R, a system evolves over time according to the joint effect of the 
probabilistic laws of motion and the sequence of decisions made (actions taken). In 
particular, when the system is in state i and decision $d_i(R) = k$ is made, then the 
probability that the system is in state j at the next observed time period is given by 
$p_{ij}(k)$. This situation results in a sequence of observed states $X_0, X_1, \ldots$ and a 
sequence of decisions made, $A_0, A_1, \ldots$. This sequence of observed states and 
sequence of decisions made is called a Markovian decision process. The term 
Markovian is used because of the underlying assumptions made about the probabilistic 
laws of motion. 

Four maintenance policies have been described, but their properties have not 
been evaluated. Questions such as "Which one is 'best'?" remain to be answered. 
To pursue this avenue, it is necessary to introduce a cost structure. When the system 
is in state i and decision $d_i(R) = k$ is made following policy R, a known cost $C_{ik}$ is 
incurred. This cost may represent an expected rather than an actual cost. For example, 
in the maintenance problem, the cost of leaving a machine as is depends upon the 
random variable, the number of defective items produced during the next time period. 
The expected value of this cost function taken with respect to the distribution of the 
number of defective items will result in the desired cost $C_{ik}$.¹ It is important to reiterate 
that this cost depends upon only the state the system is found in and the decision 
made; that is, 


$C_{ik}$ = known (expected) cost incurred during next transition 
if system is in state i and decision k is made. 


For the four maintenance policies, the costs can be obtained from the following 
information: 




















                               Expected Cost Due                  Cost (Lost 
                               to Producing       Maintenance     Profit) of         Total Cost 
Decision        State          Defective Items    Cost            Lost Production    per Day 

1. Leave        0              0                  0               0                  0 
   machine      1              $1,000             0               0                  $1,000 
   as is        2              $3,000             0               0                  $3,000 
                3              ∞*                 0               0                  ∞ 
2. Overhaul     0, 1, 2        0                  $2,000          $2,000             $4,000 
                3              ∞*                 $2,000          $2,000             ∞ 
3. Replace      0, 1, 2, 3     0                  $4,000          $2,000             $6,000 


* Because leaving the machine in an inoperable condition or overhauling it when it is inoperable is 
prohibited by assumption, a cost of infinity is assigned. An alternative approach would be to omit these 
decisions from the set of possible decisions when the machine is found to be in state 3. 


Note that the costs incurred when the decision is made to replace the machine 
are independent of the state of the system. This fact is evident because no production 
takes place during the ensuing day when this action is taken. Finally, the total expected 
costs incurred per day are summarized as follows: 


$C_{ik}$ (In Thousands of Dollars) 

              Decision k 
State i     1      2      3 
   0        0      4      6 
   1        1      4      6 
   2        3      4      6 
   3        ∞      ∞      6 


¹ See Sec. 15.7 under the subsection on expected average cost per unit time for complex cost functions 
for an additional example. 

To compare policies, we must settle on an appropriate cost measure. One such 
measure associated with a policy is the (long-run) expected average cost per unit time; 
this measure will be the one used.¹ The results appearing in Sec. 15.7 are appropriate; 
i.e., for any policy, the (long-run) expected average cost per unit time, E(C), can be 
calculated from the expression 

$$E(C) = \sum_{i=0}^{M} C_{ik}\pi_i,$$

where $k = d_i(R)$ for each i, and $(\pi_0, \pi_1, \ldots, \pi_M)$ represents the steady-state 
distribution of the state of the system under the policy R being evaluated. Thus the policy 
that minimizes E(C) is sought. Using this criterion, it is evident that the distribution 
over the initial states of the system is not important, because the long-run effect of 
the cost of the initial decision is negligible. In the maintenance example, it is necessary 
to solve for $(\pi_0, \pi_1, \ldots, \pi_M)$ under each of the four policies of interest and then 
use these results to obtain E(C). The necessary calculations for $R_a$ are given in Sec. 
20.1; all are now summarized: 
















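Policy   $(\pi_0, \pi_1, \pi_2, \pi_3)$     E(C) (in thousands of dollars) 

$R_a$    (2/13, 7/13, 2/13, 2/13)    (1/13)[2(0) + 7(1) + 2(3) + 2(6)] = 25/13 = 1.923 
$R_b$    (2/21, 5/7, 2/21, 2/21)     (1/21)[2(0) + 15(1) + 2(4) + 2(6)] = 35/21 = 1.667  (minimum) 
$R_c$    (2/11, 7/11, 1/11, 1/11)    (1/11)[2(0) + 7(1) + 1(6) + 1(6)] = 19/11 = 1.727 
$R_d$    (1/2, 7/16, 1/32, 1/32)     (1/32)[16(0) + 14(6) + 1(6) + 1(6)] = 96/32 = 3 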





It is evident that policy $R_b$ is the best. Among the four policies considered, the policy 
that calls for replacing the machine when it is found to be in state 3 and overhauling 
it when it is found to be in state 2 is the best, and the (long-run) expected average 
cost per day is $1,667. 

The technique described here is just an exhaustive enumeration of a given set 
of possible policies. It is evident that direct enumeration becomes cumbersome when 
the number of policies is large and algorithms are desirable for finding optimal poli- 
cies. The next three sections consider such algorithms. 
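
The exhaustive enumeration just described can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original text: the names `steady_state`, `transition_row`, and `policies` are hypothetical, the labels "Ra" through "Rd" stand for the four policies in the order they are listed above, and the transition probabilities and costs are those of the maintenance example (costs in thousands of dollars, with the prohibited state-3 decisions simply left out).

```python
from fractions import Fraction as F

def steady_state(P):
    """Solve pi = pi P with sum(pi) = 1 by Gauss-Jordan elimination (exact)."""
    n = len(P)
    A = [[P[i][j] - (1 if i == j else 0) for i in range(n)] + [F(0)]
         for j in range(n - 1)]          # n-1 balance equations
    A.append([F(1)] * (n + 1))           # normalization row
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [A[r][n] for r in range(n)]

# Rows of p_ij(1) (do nothing) for states 0, 1, 2.
DO_NOTHING = [
    [F(0), F(7, 8), F(1, 16), F(1, 16)],
    [F(0), F(3, 4), F(1, 8), F(1, 8)],
    [F(0), F(0), F(1, 2), F(1, 2)],
]

def transition_row(i, k):
    """Row i of p_ij(k): k=1 do nothing, k=2 overhaul (to 1), k=3 replace (to 0)."""
    if k == 1:
        return DO_NOTHING[i]             # never requested for state 3 here
    target = 1 if k == 2 else 0
    return [F(int(j == target)) for j in range(4)]

# C_ik in thousands of dollars; prohibited pairs (3, 1) and (3, 2) are omitted.
C = {(0, 1): 0, (1, 1): 1, (2, 1): 3,
     (0, 2): 4, (1, 2): 4, (2, 2): 4,
     (0, 3): 6, (1, 3): 6, (2, 3): 6, (3, 3): 6}

# d_i(R) for i = 0, 1, 2, 3 under each policy.
policies = {"Ra": (1, 1, 1, 3), "Rb": (1, 1, 2, 3),
            "Rc": (1, 1, 3, 3), "Rd": (1, 3, 3, 3)}

expected_cost = {}
for name, d in policies.items():
    P = [transition_row(i, d[i]) for i in range(4)]
    pi = steady_state(P)
    expected_cost[name] = sum(pi[i] * C[(i, d[i])] for i in range(4))

best = min(expected_cost, key=expected_cost.get)
print({name: float(c) for name, c in expected_cost.items()}, best)
```

With M + 1 states and K decisions there can be as many as K^(M+1) stationary deterministic policies, so a loop of this kind is practical only for very small models.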


1 The ensuing results are also valid for the (long-run) actual average cost per unit time measure as noted 
in Sec. 15.7. 


20.3 Linear Programming and Optimal Policies 


Section 20.2 defined a policy, and we saw that a policy R can be viewed as a rule 
that prescribes decision $d_i(R)$ when the system is in state i. Thus R is characterized 
by the values 

$$\{d_0(R), d_1(R), \ldots, d_M(R)\}.$$


Alternatively, R can be characterized by assigning values $D_{ik} = 0$ or 1 in the matrix 

$$
\begin{array}{c|cccc}
\text{State} & k = 1 & k = 2 & \cdots & k = K \\ \hline
0 & D_{01} & D_{02} & \cdots & D_{0K} \\
1 & D_{11} & D_{12} & \cdots & D_{1K} \\
\vdots & \vdots & \vdots & & \vdots \\
M & D_{M1} & D_{M2} & \cdots & D_{MK}
\end{array}
$$

where each row must contain a single 1 with the rest of the elements zero (i.e., each 
row sums to 1). When an element $D_{ik} = 1$, it can be interpreted as calling for decision 
k when the system is in state i. In the maintenance-model example, policy $R_b$ can be 
characterized by the matrix 


           Decision, k 
State     1     2     3 
  0       1     0     0 
  1       1     0     0 
  2       0     1     0 
  3       0     0     1 

i.e., replace the machine when it is in state 3, overhaul the machine when it is in 
state 2, and leave the machine as is when it is in state 0 or 1. This interpretation of 
the $D_{ik}$ provides motivation for a linear programming formulation. It is hoped that the 
expected cost of a policy can be expressed as a linear function of the $D_{ik}$ or a related 
variable, subject to linear constraints. Unfortunately, the $D_{ik}$ are integers (zero or 1), 
and continuous variables are required for a linear programming formulation. This 
requirement can be handled by expanding the interpretation of a policy. The previous 
definition calls for making the same decision every time the system is in state i. The 
new interpretation of a policy will call for determining a probability distribution for 
the decision to be made when the system is in state i. Thus $D_{ik}$ can now be viewed 
as¹ 

$$D_{ik} = P\{\text{decision} = k \mid \text{state} = i\}, \qquad k = 1, 2, \ldots, K; \quad i = 0, 1, \ldots, M.$$

Such a policy is called a randomized policy, whereas the policy calling for 
$D_{ik} = 0$ or 1 can be called a deterministic policy. Randomized policies can again be 
characterized by the matrix 

$$
\begin{array}{c|cccc}
\text{State} & k = 1 & k = 2 & \cdots & k = K \\ \hline
0 & D_{01} & D_{02} & \cdots & D_{0K} \\
1 & D_{11} & D_{12} & \cdots & D_{1K} \\
\vdots & \vdots & \vdots & & \vdots \\
M & D_{M1} & D_{M2} & \cdots & D_{MK}
\end{array}
$$

where each row sums to 1, and now 

$$0 \le D_{ik} \le 1.$$


¹ The right-hand side of this equation is read as the conditional probability that the decision k is made, 
given the system is in state i. 

Note that each row $(D_{i1}, D_{i2}, \ldots, D_{iK})$ is the probability distribution for the decision 
to be made when the system is in state i. As an example, suppose that a new policy, 
$R_e$, is to be used in the maintenance model. This policy is a randomized policy and 
is given by the matrix 


           Decision, k 
State     1      2      3 
  0       1      0      0 
  1       1      0      0 
  2      1/4    1/4    1/2 
  3       0     1/2    1/2 


This policy calls for observing the state of the machine at the end of the day. If it is 
found to be in state 0 or 1, it is left as is. If it is found to be in state 2, it is left as 
is with probability 1/4, overhauled with probability 1/4, and replaced with probability 1/2. 
Presumably, a random device with these probabilities (possibly a table of random 
numbers) can be used to make the actual decision. Finally, if the machine is found 
to be in state 3, it is overhauled with probability 1/2 and replaced with probability 1/2. 
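
A randomized policy still induces an ordinary Markov chain, whose transition probabilities are the mixture $\sum_k D_{ik}\, p_{ij}(k)$. The Python sketch below is illustrative only and is not part of the original text. It assumes that the randomized policy just described uses probabilities 1/4, 1/4, 1/2 in state 2 and 0, 1/2, 1/2 in state 3, and the names `P_BY_DECISION` and `induced_matrix` are hypothetical.

```python
from fractions import Fraction as F

# p_ij(k) for the maintenance example: k = 1 do nothing, 2 overhaul, 3 replace.
P_BY_DECISION = {
    1: [[F(0), F(7, 8), F(1, 16), F(1, 16)],
        [F(0), F(3, 4), F(1, 8), F(1, 8)],
        [F(0), F(0), F(1, 2), F(1, 2)],
        [F(0), F(0), F(0), F(1)]],
    2: [[F(0), F(1), F(0), F(0)] for _ in range(4)],  # overhaul: to state 1
    3: [[F(1), F(0), F(0), F(0)] for _ in range(4)],  # replace: to state 0
}

# D[i][k-1] = P{decision = k | state = i} (assumed reading of the policy).
D = [
    [F(1), F(0), F(0)],
    [F(1), F(0), F(0)],
    [F(1, 4), F(1, 4), F(1, 2)],
    [F(0), F(1, 2), F(1, 2)],
]

def induced_matrix(D, P_by_decision):
    """Return the matrix with entries sum_k D_ik * p_ij(k)."""
    n = len(D)
    return [[sum(D[i][k - 1] * P_by_decision[k][i][j] for k in P_by_decision)
             for j in range(n)]
            for i in range(n)]

P_e = induced_matrix(D, P_BY_DECISION)
for row in P_e:
    print([str(x) for x in row])
```

Rows 0 and 1 are unchanged from the do-nothing matrix because the policy is deterministic in those states; only rows 2 and 3 are genuine mixtures.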

The linear programming formulation is best expressed in terms of a variable 
$y_{ik}$, which is related to $D_{ik}$ as follows. Let $y_{ik}$ be the steady-state unconditional 
probability that the system is in state i and decision k is made; that is, 

$$y_{ik} = P\{\text{state} = i \text{ and decision} = k\}.$$

From the rules of conditional probability, 

$$y_{ik} = \pi_i D_{ik}.$$

Furthermore, 

$$\pi_i = \sum_{k=1}^{K} y_{ik},$$

so that 

$$D_{ik} = \frac{y_{ik}}{\pi_i} = \frac{y_{ik}}{\sum_{k=1}^{K} y_{ik}}.$$


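There exist several constraints on $y_{ik}$: 


1. $\sum_{i=0}^{M} \pi_i = 1$, so that $\sum_{i=0}^{M} \sum_{k=1}^{K} y_{ik} = 1$. 


2. From results on steady-state probabilities (see Sec. 15.7),¹ 

$$\pi_j = \sum_{i=0}^{M} \pi_i p_{ij},$$

so that 

$$\sum_{k=1}^{K} y_{jk} = \sum_{i=0}^{M} \sum_{k=1}^{K} y_{ik}\, p_{ij}(k), \qquad \text{for } j = 0, 1, \ldots, M.$$

3. $y_{ik} \ge 0$, i = 0, 1, . . . , M and k = 1, 2, . . . , K. 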


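The long-run expected average cost per unit time is given by 

$$E(C) = \sum_{i=0}^{M} \sum_{k=1}^{K} \pi_i C_{ik} D_{ik} = \sum_{i=0}^{M} \sum_{k=1}^{K} C_{ik}\, y_{ik}.$$

Hence the problem is to choose the $y_{ik}$ that 

$$\text{Minimize} \quad \sum_{i=0}^{M} \sum_{k=1}^{K} C_{ik}\, y_{ik},$$

subject to the constraints 

$$
\begin{aligned}
&(1) \quad \sum_{i=0}^{M} \sum_{k=1}^{K} y_{ik} = 1, \\
&(2) \quad \sum_{k=1}^{K} y_{jk} - \sum_{i=0}^{M} \sum_{k=1}^{K} y_{ik}\, p_{ij}(k) = 0, \qquad \text{for } j = 0, 1, \ldots, M, \\
&(3) \quad y_{ik} \ge 0, \qquad i = 0, 1, \ldots, M; \quad k = 1, 2, \ldots, K.
\end{aligned}
$$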


This formulation is clearly a linear programming problem that can be solved 
by the simplex method. Once the $y_{ik}$ are obtained, $D_{ik}$ is easily found from 

$$D_{ik} = \frac{y_{ik}}{\sum_{k=1}^{K} y_{ik}}.$$

The solution has some interesting properties. It will contain (M + 1) basic 
variables $y_{ik} \ge 0$ (there is one redundant constraint). It can be shown that $y_{ik} > 0$ for 
at least one k = 1, 2, . . . , K, for each i = 0, 1, . . . , M. Therefore, it follows 
that $y_{ik} > 0$ for only one k for each i = 0, 1, . . . , M; that is, $D_{ik} = 0$ or 1. In other 
words, the optimal policy is deterministic rather than randomized. Finally, since there 
are (M + 2) functional constraints and K(M + 1) original variables, "practical" 
problems tend to be large under this formulation, so that solutions may not be 
obtainable even with the simplex method. 



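EXAMPLE: We can formulate the machine-maintenance problem as a linear 
program; i.e.: 

$$
\begin{aligned}
\text{Minimize} \quad & 4{,}000y_{02} + 6{,}000y_{03} + 1{,}000y_{11} + 4{,}000y_{12} + 6{,}000y_{13} \\
& + 3{,}000y_{21} + 4{,}000y_{22} + 6{,}000y_{23} + M_1 y_{31} + M_2 y_{32} + 6{,}000y_{33},
\end{aligned}
$$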

¹ The k is introduced in $p_{ij}(k)$ to indicate that the appropriate transition probability depends upon the 
decision k. 
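where $M_1$ and $M_2$ are taken to be large numbers, subject to 

$$
\begin{aligned}
\sum_{i=0}^{3} \sum_{k=1}^{3} y_{ik} &= 1, \\
\sum_{k=1}^{3} y_{0k} - (y_{03} + y_{13} + y_{23} + y_{33}) &= 0, \\
\sum_{k=1}^{3} y_{1k} - \left(\tfrac{7}{8}y_{01} + y_{02} + \tfrac{3}{4}y_{11} + y_{12} + y_{22} + y_{32}\right) &= 0, \\
\sum_{k=1}^{3} y_{2k} - \left(\tfrac{1}{16}y_{01} + \tfrac{1}{8}y_{11} + \tfrac{1}{2}y_{21}\right) &= 0, \\
\sum_{k=1}^{3} y_{3k} - \left(\tfrac{1}{16}y_{01} + \tfrac{1}{8}y_{11} + \tfrac{1}{2}y_{21} + y_{31}\right) &= 0,
\end{aligned}
$$

and $y_{ik} \ge 0$, i = 0, 1, 2, 3 and k = 1, 2, 3. 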


This linear program can be solved by using the simplex method. 

The results yield all $y_{ik}$ equal to zero, except for $y_{01} = 2/21$, $y_{11} = 5/7$, $y_{22} = 2/21$, 
and $y_{33} = 2/21$. Note that these values are just the steady-state probabilities for policy 
$R_b$, which is now seen to be the optimal policy. The corresponding 

$$D_{ik} = \frac{y_{ik}}{\sum_{k=1}^{3} y_{ik}}$$

are given by $D_{01} = D_{11} = D_{22} = D_{33} = 1$, 
and all the remaining $D_{ik} = 0$. This policy calls for leaving the machine as is when 
it is in state 0 or 1, overhauling it when it is in state 2, and replacing it when it is in 
state 3. 
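
The reported solution can be checked against the constraints directly. The Python sketch below is illustrative and not part of the original text. It does not run the simplex method and does not prove optimality; it only verifies that the stated values of $y_{ik}$ satisfy constraints (1) and (2), evaluates the objective, and recovers the $D_{ik}$. The function name `p` and the dictionary layout are hypothetical, and the big-M terms are left out of the cost table because the corresponding variables are zero in this solution.

```python
from fractions import Fraction as F

STATES = range(4)
DECISIONS = (1, 2, 3)   # 1 do nothing, 2 overhaul, 3 replace

DO_NOTHING = [
    [F(0), F(7, 8), F(1, 16), F(1, 16)],
    [F(0), F(3, 4), F(1, 8), F(1, 8)],
    [F(0), F(0), F(1, 2), F(1, 2)],
    [F(0), F(0), F(0), F(1)],
]

def p(i, j, k):
    """Transition probability p_ij(k) for the maintenance example."""
    if k == 1:
        return DO_NOTHING[i][j]
    target = 1 if k == 2 else 0      # overhaul -> state 1, replace -> state 0
    return F(int(j == target))

# Reported solution: every y_ik is zero except these four.
y = {(i, k): F(0) for i in STATES for k in DECISIONS}
y[(0, 1)] = F(2, 21)
y[(1, 1)] = F(5, 7)
y[(2, 2)] = F(2, 21)
y[(3, 3)] = F(2, 21)

# Finite costs in dollars; the big-M pairs (3, 1) and (3, 2) are omitted.
COST = {(0, 1): 0, (0, 2): 4000, (0, 3): 6000,
        (1, 1): 1000, (1, 2): 4000, (1, 3): 6000,
        (2, 1): 3000, (2, 2): 4000, (2, 3): 6000,
        (3, 3): 6000}

total = sum(y.values())                                   # constraint (1)
balance = [sum(y[(j, k)] for k in DECISIONS)
           - sum(y[(i, k)] * p(i, j, k) for i in STATES for k in DECISIONS)
           for j in STATES]                               # constraint (2)
objective = sum(COST[key] * y[key] for key in COST)

D = {}
for i in STATES:
    pi_i = sum(y[(i, k)] for k in DECISIONS)              # pi_i = sum_k y_ik
    for k in DECISIONS:
        D[(i, k)] = y[(i, k)] / pi_i

print(total, balance, float(objective))
```

The objective comes out in dollars per day, so it can be compared with the expected average cost found for the same policy by enumeration in Sec. 20.2.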


20.4 Policy Improvement Algorithms 
for Finding Optimal Policies 


A second algorithm for finding optimal policies is given by a policy improvement 
technique. The algorithm to be presented is useful in that it often leads to finding the 
optimal policy quickly, and also it is applicable under more general conditions than 
previously specified; e.g., under certain assumptions, the number of states may be 
countably infinite rather than finite. 

Following the model of Sec. 20.2, and as a joint result of the current state i of 
the system and the decision $d_i(R) = k$ when operating under policy R, two things 
occur. An (expected) cost $C_{ik}$ that depends upon only the observed state of the system 
and the decision made is incurred. The system moves to a new state j at the next 
observed time period, with transition probability given by $p_{ij}(k)$. If, in fact, a cost 
that depends upon both the initial and transited states is incurred, it is treated as 
follows. Denote by $q_{ij}(k)$ the (expected) cost incurred when the system is in state i 
and decision k is made, and then evolves to state j at the next observed time period. 
Then 

$$C_{ik} = \sum_{j=0}^{M} q_{ij}(k)\, p_{ij}(k).$$


When a system operates as just described under policy R, it can be shown that there exist values $g(R), v_0(R), v_1(R), \ldots, v_M(R)$ that satisfy

$$g(R) + v_i(R) = C_{ik} + \sum_{j=0}^{M} p_{ij}(k)\,v_j(R), \quad \text{for } i = 0, 1, 2, \ldots, M.$$


A heuristic justification for these relationships and an interpretation for these values are desirable. Denote by $v_i^n(R)$ the total expected cost of a system starting in state i (at the first observed time period) and evolving for n time periods. Then $v_i^n(R)$ consists of two components, namely, (1) $C_{ik}$, the cost incurred at the first observed time period as a result of the current state i and the decision $d_i(R) = k$ when operating under policy R, and (2) $\sum_{j=0}^{M} p_{ij}(k)\,v_j^{n-1}(R)$, the total expected cost of the system evolving over the remaining n - 1 time periods. Thus the recursive equation,

$$v_i^n(R) = C_{ik} + \sum_{j=0}^{M} p_{ij}(k)\,v_j^{n-1}(R),$$

for i = 0, 1, 2, ..., M and $v_i^1(R) = C_{ik}$ for all i, is obtained. It is of interest to explore the behavior of the total expected cost $v_i^n(R)$ as n gets large. Now, it is known that the long-run expected average cost per unit time following any policy R can be expressed as

$$g(R) = \sum_{i=0}^{M} \pi_i C_{ik},$$

which is independent of the starting state i. Hence $v_i^n(R)$ behaves approximately as $n\,g(R)$ for large n, and, in fact, can be expressed (neglecting certain fluctuations) as the sum of two components, one of which is independent of the initial state and one of which is dependent upon it; that is,

$$v_i^n(R) \approx n\,g(R) + v_i(R),$$

where $v_i(R)$ can be interpreted as the effect on the total expected cost due to starting in state i. Thus

$$v_i^n(R) - v_j^n(R) \approx v_i(R) - v_j(R),$$

so that $v_i(R) - v_j(R)$ is a measure of the effect of starting in state i rather than state j.

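Substituting this linear approximation for $v_i^n(R)$ (assumed to be valid for large n) into the recursive equation for $v_i^n(R)$, and using $\sum_{j=0}^{M} p_{ij}(k) = 1$ so that the terms in $(n-1)\,g(R)$ cancel, leads to

$$g(R) + v_i(R) = C_{ik} + \sum_{j=0}^{M} p_{ij}(k)\,v_j(R),$$

for i = 0, 1, ..., M, so that these values satisfy the expressed equations.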

Note that there are M + 1 equations with M + 2 unknowns, so that one of these variables may be chosen arbitrarily. By convention, $v_M(R)$ will be chosen equal to zero. Therefore, by solving a system of linear equations, the (long-run) expected average cost per unit time following policy R, g(R), can be obtained. In principle, all policies can be enumerated, and that policy which minimizes g(R) can be found. However, even for a moderate number of states and decisions, this technique is cumbersome. Fortunately, there exists an algorithm that can be used to evaluate policies and find the optimum one without complete enumeration. The algorithm begins by choosing an arbitrary policy $R_1$ and calculates the values of $g(R_1), v_0(R_1), v_1(R_1), \ldots, v_{M-1}(R_1)$ [recall that $v_M(R_1)$ is chosen equal to zero]. This step is called value determination. A better policy, denoted by $R_2$, is then constructed. This step is called policy improvement. Using the new policy $R_2$, we repeat the value-determination step. These steps continue until two successive iterations lead to identical policies, which signifies that the optimal policy has been obtained. In particular, the following steps are to be followed:


Step 1: Value determination  For an arbitrarily chosen policy $R_1$, use $p_{ij}(k_1)$, $C_{ik_1}$, and $v_M(R_1) = 0$ to solve the set of (M + 1) equations,

$$g(R_1) = C_{ik_1} + \sum_{j=0}^{M} p_{ij}(k_1)\,v_j(R_1) - v_i(R_1), \quad i = 0, 1, \ldots, M,$$

for all (M + 1) unknown values of $g(R_1), v_0(R_1), v_1(R_1), \ldots, v_{M-1}(R_1)$.


Step 2: Policy improvement  Using the current values of $v_i(R_1)$ computed for policy $R_1$, find the alternative policy $R_2$ such that, for each state i, $d_i(R_2) = k_2$ is the decision that makes

$$C_{ik_2} + \sum_{j=0}^{M} p_{ij}(k_2)\,v_j(R_1) - v_i(R_1)$$

a minimum; i.e., for each state i, find the appropriate value of $k_2$ that

$$\underset{k_2 = 1, 2, \ldots, K}{\text{Minimizes}} \left\{ C_{ik_2} + \sum_{j=0}^{M} p_{ij}(k_2)\,v_j(R_1) - v_i(R_1) \right\},$$

and then set $d_i(R_2)$ equal to the minimizing value of $k_2$. This procedure defines a new policy, $R_2$.

If $R_2$ does not equal $R_1$, then return to step 1, using $R_2$ instead of $R_1$, and solve for $g(R_2), v_0(R_2), v_1(R_2), \ldots, v_{M-1}(R_2)$. Using these values, go to step 2 and find $R_3$. Continue in this fashion until you find two successive R's to be equal. When you find them, the optimal policy is achieved, and the algorithm terminates. In fact, it can be shown that


1. $g(R_{j+1}) \le g(R_j)$, for j = 1, 2, ..., and
2. The algorithm terminates with the optimal solution in a finite number of
iterations.
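The two steps can be sketched in a few lines of code. The following is a minimal illustration, not a definitive implementation: it uses the machine-maintenance data of the example that follows, represents a forbidden decision by an infinite cost (an assumption standing in for the "large numbers" $M_1$ and $M_2$), and keeps the current decision on ties. All function and variable names are hypothetical.

```python
from fractions import Fraction as F

INF = float("inf")  # marks a decision that is not allowed in a state

# P[k][i][j] = p_ij(k) and C[k][i] = C_ik for the maintenance model (states 0-3).
P = {
    1: [[0, F(7, 8), F(1, 16), F(1, 16)],
        [0, F(3, 4), F(1, 8), F(1, 8)],
        [0, 0, F(1, 2), F(1, 2)],
        [0, 0, 0, 1]],
    2: [[0, 1, 0, 0]] * 4,   # overhaul: next state is 1
    3: [[1, 0, 0, 0]] * 4,   # replace: next state is 0
}
C = {1: [0, 1000, 3000, INF], 2: [4000, 4000, 4000, INF], 3: [6000] * 4}


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact arithmetic."""
    n = len(b)
    M = [[F(x) for x in row] + [F(r)] for row, r in zip(A, b)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [x - factor * z for x, z in zip(M[r], M[c])]
    return [row[n] for row in M]


def value_determination(policy):
    """Step 1: solve for g(R) and v_0..v_{M-1}, with v_M fixed at 0."""
    n = len(policy)
    A, b = [], []
    for i, k in enumerate(policy):
        # Unknowns ordered [g, v_0, ..., v_{M-1}]: g + v_i - sum_j p_ij v_j = C_ik
        A.append([1] + [int(i == j) - P[k][i][j] for j in range(n - 1)])
        b.append(C[k][i])
    x = solve(A, b)
    return x[0], x[1:] + [F(0)]


def policy_improvement_step(policy, v):
    """Step 2: in each state pick the decision minimizing the test quantity."""
    n = len(policy)
    new_policy = []
    for i in range(n):
        def test(k):
            return C[k][i] + sum(P[k][i][j] * v[j] for j in range(n)) - v[i]
        best = min(P, key=test)
        # Keep the current decision on ties so the loop cannot cycle.
        new_policy.append(policy[i] if test(policy[i]) == test(best) else best)
    return new_policy


def policy_improvement(policy):
    """Alternate steps 1 and 2 until two successive policies are equal."""
    while True:
        g, v = value_determination(policy)
        new_policy = policy_improvement_step(policy, v)
        if new_policy == policy:
            return policy, g, v
        policy = new_policy


# Start from the policy that replaces in state 3 and otherwise leaves as is.
policy, g, v = policy_improvement([1, 1, 1, 3])
print(policy, g, v)
```

A policy is written here as a list giving the decision for each state, so `[1, 1, 1, 3]` means "leave as is in states 0, 1, and 2; replace in state 3". Exact fractions are used so that the equality test between successive policies is not affected by rounding.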


EXAMPLE: We shall solve the maintenance model presented in Sec. 20.2 by the policy improvement algorithm. Recall that the machine can be in one of four states: state 0, signifying the machine is as good as new; state 1, signifying the machine is operable with minor deterioration; state 2, signifying the machine is operable with major deterioration; and state 3, signifying the machine is inoperable. There exist three possible decisions: decision 1 implies leaving the machine as is; decision 2 implies overhaul, which returns it to state 1; and decision 3 implies replacement, which returns it to state 0. Each decision necessitates an action that affects the transition matrix, and there are costs $C_{ik}$ associated with making decision k when the system is in state i. We want to find the optimal policy, and step 1 of the algorithm calls for choosing a policy arbitrarily. Choose the policy that calls for replacement of the machine when it is found to be in state 3; otherwise, leave the machine as is. Denote this policy by $R_1$. The transition matrix for this policy is given by





State |  0      1      2      3
  0   |  0     7/8    1/16   1/16
  1   |  0     3/4    1/8    1/8
  2   |  0      0     1/2    1/2
  3   |  1      0      0      0


The costs incurred following policy $R_1$ are given by


State |  C_ik
  0   |      0
  1   |  1,000
  2   |  3,000
  3   |  6,000

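With this policy, the value-determination step requires solving the following four equations simultaneously for $g(R_1)$, $v_0(R_1)$, $v_1(R_1)$, and $v_2(R_1)$ [recall that $v_3(R_1)$ is arbitrarily taken to be zero]:

$$\begin{aligned}
g(R_1) &= C_{01} + \sum_{j=0}^{3} p_{0j}(1)\,v_j(R_1) - v_0(R_1) \\
&= C_{11} + \sum_{j=0}^{3} p_{1j}(1)\,v_j(R_1) - v_1(R_1) \\
&= C_{21} + \sum_{j=0}^{3} p_{2j}(1)\,v_j(R_1) - v_2(R_1) \\
&= C_{33} + \sum_{j=0}^{3} p_{3j}(3)\,v_j(R_1) - v_3(R_1),
\end{aligned}$$

or alternatively [with $v_3(R_1) = 0$],

$$\begin{aligned}
g(R_1) &= \tfrac{7}{8}v_1(R_1) + \tfrac{1}{16}v_2(R_1) - v_0(R_1) \\
&= 1{,}000 + \tfrac{3}{4}v_1(R_1) + \tfrac{1}{8}v_2(R_1) - v_1(R_1) \\
&= 3{,}000 + \tfrac{1}{2}v_2(R_1) - v_2(R_1) \\
&= 6{,}000 + v_0(R_1).
\end{aligned}$$

The simultaneous solution to this system of equations yields

$$\begin{aligned}
g(R_1) &= \tfrac{25{,}000}{13} = 1{,}923 \\
v_0(R_1) &= -\tfrac{53{,}000}{13} = -4{,}077 \\
v_1(R_1) &= -\tfrac{34{,}000}{13} = -2{,}615 \\
v_2(R_1) &= \tfrac{28{,}000}{13} = 2{,}154.
\end{aligned}$$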
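One way to reach these values by hand is to eliminate the $v_j(R_1)$ in favor of $g(R_1)$. The steps below are a sketch of that elimination, writing $g, v_0, v_1, v_2$ for $g(R_1), v_0(R_1), v_1(R_1), v_2(R_1)$ and starting from the four value-determination equations for $R_1$ with $v_3 = 0$.

```latex
\begin{aligned}
g &= \tfrac{7}{8}v_1 + \tfrac{1}{16}v_2 - v_0, \\
g &= 1{,}000 - \tfrac{1}{4}v_1 + \tfrac{1}{8}v_2, \\
g &= 3{,}000 - \tfrac{1}{2}v_2, \\
g &= 6{,}000 + v_0. \\[4pt]
\text{From the last three:}\quad
v_0 &= g - 6{,}000, \quad v_2 = 6{,}000 - 2g, \quad v_1 = 7{,}000 - 5g. \\
\text{Substituting into the first:}\quad
g &= \tfrac{7}{8}(7{,}000 - 5g) + \tfrac{1}{16}(6{,}000 - 2g) - (g - 6{,}000)
   = 12{,}500 - \tfrac{11}{2}g, \\
\text{so}\quad
g &= \tfrac{25{,}000}{13}, \quad
v_0 = -\tfrac{53{,}000}{13}, \quad
v_1 = -\tfrac{34{,}000}{13}, \quad
v_2 = \tfrac{28{,}000}{13}.
\end{aligned}
```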


Step 2 can now be applied. It is necessary to find the improved policy $R_2$, which has the property that $d_0(R_2) = k_0^2$, $d_1(R_2) = k_1^2$, $d_2(R_2) = k_2^2$, and $d_3(R_2) = k_3^2$ minimize the following expressions:

$$\begin{aligned}
(0)\quad & C_{0k_0^2} - 4{,}077\,p_{00}(k_0^2) - 2{,}615\,p_{01}(k_0^2) + 2{,}154\,p_{02}(k_0^2) + 4{,}077 \\
(1)\quad & C_{1k_1^2} - 4{,}077\,p_{10}(k_1^2) - 2{,}615\,p_{11}(k_1^2) + 2{,}154\,p_{12}(k_1^2) + 2{,}615 \\
(2)\quad & C_{2k_2^2} - 4{,}077\,p_{20}(k_2^2) - 2{,}615\,p_{21}(k_2^2) + 2{,}154\,p_{22}(k_2^2) - 2{,}154 \\
(3)\quad & C_{3k_3^2} - 4{,}077\,p_{30}(k_3^2) - 2{,}615\,p_{31}(k_3^2) + 2{,}154\,p_{32}(k_3^2).
\end{aligned}$$

To find $k_0^2$, the "best" decision when the machine is in state 0, it is necessary to evaluate the first expression for all possible decisions. Note that the appropriate transition probabilities and the costs $C_{0k}$ depend upon the decisions made. A summary of the necessary calculations follows:


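State 0

Decision | p00(k) | p01(k) | p02(k) | p03(k) | C_0k  | Value of Expression 0
   1     |   0    |  7/8   |  1/16  |  1/16  |     0 |  1,923
   2     |   0    |   1    |   0    |   0    | 4,000 |  5,462
   3     |   1    |   0    |   0    |   0    | 6,000 |  6,000
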

It is clear that $d_0(R_2) = k_0^2 = 1$ minimizes this first expression, so that under $R_2$ the appropriate decision when the system is in state 0 is to leave the machine as is.

Similar calculations are required to find $d_1(R_2) = k_1^2$, $d_2(R_2) = k_2^2$, and $d_3(R_2) = k_3^2$; these calculations are summarized as follows:


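State 1

Decision | p10(k) | p11(k) | p12(k) | p13(k) | C_1k  | Value of Expression 1
   1     |   0    |  3/4   |  1/8   |  1/8   | 1,000 |  1,923
   2     |   0    |   1    |   0    |   0    | 4,000 |  4,000
   3     |   1    |   0    |   0    |   0    | 6,000 |  4,538


State 2

Decision | p20(k) | p21(k) | p22(k) | p23(k) | C_2k  | Value of Expression 2
   1     |   0    |   0    |  1/2   |  1/2   | 3,000 |  1,923
   2     |   0    |   1    |   0    |   0    | 4,000 |   -769
   3     |   1    |   0    |   0    |   0    | 6,000 |   -231


State 3

Decision | p30(k) | p31(k) | p32(k) | p33(k) | C_3k  | Value of Expression 3
   1     |   0    |   0    |   0    |   1    |   ∞   |    ∞
   2     |   0    |   1    |   0    |   0    |   ∞   |    ∞
   3     |   1    |   0    |   0    |   0    | 6,000 |  1,923
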
Thus $d_1(R_2) = k_1^2 = 1$, $d_2(R_2) = k_2^2 = 2$, and $d_3(R_2) = k_3^2 = 3$. Hence policy $R_2$ calls for leaving the machine alone when it is in state 0 or 1, overhauling it when it is in state 2, and replacing it when it is in state 3. Furthermore, because $R_2$ differs from $R_1$, at least one more iteration is required. The equations that must now be solved are given by [again setting $v_3(R_2) = 0$]

$$\begin{aligned}
g(R_2) &= \tfrac{7}{8}v_1(R_2) + \tfrac{1}{16}v_2(R_2) - v_0(R_2) \\
&= 1{,}000 + \tfrac{3}{4}v_1(R_2) + \tfrac{1}{8}v_2(R_2) - v_1(R_2) \\
&= 4{,}000 + v_1(R_2) - v_2(R_2) \\
&= 6{,}000 + v_0(R_2).
\end{aligned}$$

The simultaneous solution to these equations yields

$$\begin{aligned}
g(R_2) &= \tfrac{5{,}000}{3} = 1{,}667 \\
v_0(R_2) &= -\tfrac{13{,}000}{3} = -4{,}333 \\
v_1(R_2) &= -3{,}000 \\
v_2(R_2) &= -\tfrac{2{,}000}{3} = -667.
\end{aligned}$$


Step 2 can now be applied. We seek an improved policy $R_3$ that has the property that $d_0(R_3) = k_0^3$, $d_1(R_3) = k_1^3$, $d_2(R_3) = k_2^3$, and $d_3(R_3) = k_3^3$ minimize the following expressions:

$$\begin{aligned}
(0)\quad & C_{0k_0^3} - 4{,}333\,p_{00}(k_0^3) - 3{,}000\,p_{01}(k_0^3) - 667\,p_{02}(k_0^3) + 4{,}333 \\
(1)\quad & C_{1k_1^3} - 4{,}333\,p_{10}(k_1^3) - 3{,}000\,p_{11}(k_1^3) - 667\,p_{12}(k_1^3) + 3{,}000 \\
(2)\quad & C_{2k_2^3} - 4{,}333\,p_{20}(k_2^3) - 3{,}000\,p_{21}(k_2^3) - 667\,p_{22}(k_2^3) + 667 \\
(3)\quad & C_{3k_3^3} - 4{,}333\,p_{30}(k_3^3) - 3{,}000\,p_{31}(k_3^3) - 667\,p_{32}(k_3^3).
\end{aligned}$$

The first iteration provides most of the necessary data (the transition probabilities and $C_{ik}$) required for determining the new policy, except for the values of each of the four expressions. These values are found to be


         | Value of     | Value of     | Value of     | Value of
Decision | Expression 0 | Expression 1 | Expression 2 | Expression 3
   1     |    1,667     |    1,667     |    3,333     |      ∞
   2     |    5,333     |    4,000     |    1,667     |      ∞
   3     |    6,000     |    4,667     |    2,334     |    1,667





Thus $d_0(R_3) = k_0^3 = 1$, $d_1(R_3) = k_1^3 = 1$, $d_2(R_3) = k_2^3 = 2$, and $d_3(R_3) = k_3^3 = 3$, so this policy is identical to $R_2$. Because the policies on two successive iterations are the same, the optimal policy has been obtained. This optimal policy calls for leaving the machine as is when it is in state 0 or 1, overhauling it when it is in state 2, and replacing it when it is in state 3. Of course, this result is the same one found in Sec. 20.3.


20.5 Criterion of Discounted Costs 


Throughout this chapter, we measured policies on the basis of the (long-run) expected 
average cost per unit time or the (long-run) actual average cost per unit time. An 
alternative measure is to find the expected long-run total discounted cost. A discount factor α < 1 is specified, so that the present value of 1 unit of cost m periods in the future is $\alpha^m$. The factor α can be interpreted as equal to 1/(1 + i), where i is the current interest rate. This measure was used extensively in Chap. 18. We are seeking a policy that minimizes the expected long-run total discounted cost.


Policy Improvement Algorithm 


The description of the Markovian decision process is as described previously. Given a distribution $P\{X_0 = i\}$ over the initial states of the system and a policy R, a system evolves over time according to the joint effect of the probabilistic laws of motion and the sequence of decisions made (actions taken). In particular, when the system is in state i and decision $d_i(R) = k$ is made, then the probability that the system is in state j at the next observed time period is given by $p_{ij}(k)$. Furthermore, a known expected cost $C_{ik}$ is incurred. Denote by $V_i^n(R)$ the expected total discounted cost of a system starting in state i (at the first observed time period) and evolving for n time periods. Then $V_i^n(R)$ consists of two components, namely, (1) $C_{ik}$, the cost incurred at the first observed time period as a result of the current state i and the decision $d_i(R) = k$ when operating under policy R; (2) $\alpha \sum_{j=0}^{M} p_{ij}(k)\,V_j^{n-1}(R)$, the expected total discounted cost of the system evolving over the remaining n - 1 time periods. Thus the recursive equation,

$$V_i^n(R) = C_{ik} + \alpha \sum_{j=0}^{M} p_{ij}(k)\,V_j^{n-1}(R),$$

for i = 0, 1, 2, ..., M and $V_i^1(R) = C_{ik}$ for all i, is obtained. This policy can be evaluated by using the techniques associated with dynamic programming. It can be shown that as n approaches infinity, this expression converges to

$$V_i(R) = C_{ik} + \alpha \sum_{j=0}^{M} p_{ij}(k)\,V_j(R), \quad \text{for } i = 0, 1, \ldots, M,$$


where $V_i(R)$ can now be interpreted as the expected long-run total discounted cost for a system starting in state i and continuing indefinitely. There are M + 1 equations and M + 1 unknowns, and hence $V_i(R)$ may be obtained by standard methods. For example, the machine-maintenance model will be solved using the policy that calls for leaving the machine as is when in state 0 or 1, overhauling it when it is in state 2, and replacing it when it is in state 3; that is, policy R. The discount factor will be chosen to be α = 0.9. The following set of equations is obtained:

$$\begin{aligned}
V_0(R) &= 0.9\left[\tfrac{7}{8}V_1(R) + \tfrac{1}{16}V_2(R) + \tfrac{1}{16}V_3(R)\right] \\
V_1(R) &= 1{,}000 + 0.9\left[\tfrac{3}{4}V_1(R) + \tfrac{1}{8}V_2(R) + \tfrac{1}{8}V_3(R)\right] \\
V_2(R) &= 4{,}000 + 0.9\,[V_1(R)] \\
V_3(R) &= 6{,}000 + 0.9\,[V_0(R)].
\end{aligned}$$

The simultaneous solution to this system of equations yields

$$\begin{aligned}
V_0(R) &= 14{,}949 \\
V_1(R) &= 16{,}262 \\
V_2(R) &= 18{,}636 \\
V_3(R) &= 19{,}454.
\end{aligned}$$

Thus, assuming that the system started in state 0, the expected long-run total discounted cost is $14,949.
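The four equations can be written as $(I - \alpha P)V = C$ and solved by any standard linear-equation routine. The sketch below does this with a small Gauss-Jordan elimination in exact fractions; the helper `solve` and the names `P_R` and `C_R` are illustrative, not part of the text.

```python
from fractions import Fraction as F

ALPHA = F(9, 10)  # discount factor

# Transition rows and costs under the policy: leave as is in states 0 and 1,
# overhaul in state 2, replace in state 3.
P_R = [
    [0, F(7, 8), F(1, 16), F(1, 16)],
    [0, F(3, 4), F(1, 8), F(1, 8)],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
C_R = [0, 1000, 4000, 6000]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact arithmetic."""
    n = len(b)
    M = [[F(x) for x in row] + [F(r)] for row, r in zip(A, b)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [x - factor * z for x, z in zip(M[r], M[c])]
    return [row[n] for row in M]


n = len(C_R)
# Row i encodes V_i - alpha * sum_j p_ij V_j = C_i.
A = [[int(i == j) - ALPHA * P_R[i][j] for j in range(n)] for i in range(n)]
V = solve(A, C_R)
print([round(float(x), 1) for x in V])
```

Exact arithmetic gives values that agree with those quoted above to within about one unit; small differences in the last digit can arise from rounding at intermediate steps of a hand calculation.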

The aforementioned procedure not only evaluates a given policy, but also is 
suggestive of an algorithm to determine the optimal policy. The calculations are similar 
to those required in the value-determination step (step 1) of the policy improvement 
technique presented in Sec. 20.4. Indeed, an algorithm very similar to that presented 
in Sec. 20.4 is available. In particular, these steps are to be followed: 


Step 1: Value determination  For an arbitrarily chosen policy $R_1$, use $p_{ij}(k_1)$ and $C_{ik_1}$ to solve the set of (M + 1) equations

$$V_i(R_1) = C_{ik_1} + \alpha \sum_{j=0}^{M} p_{ij}(k_1)\,V_j(R_1), \quad i = 0, 1, \ldots, M,$$

for all (M + 1) unknown values of $V_i(R_1)$.


Step 2: Policy improvement  Using the current values of $V_i(R_1)$, find the alternative policy $R_2$ such that, for each state i, $d_i(R_2) = k_2$ is the decision that makes

$$C_{ik_2} + \alpha \sum_{j=0}^{M} p_{ij}(k_2)\,V_j(R_1)$$

a minimum; i.e., for each state i, find the appropriate value of $k_2$ that

$$\underset{k_2 = 1, 2, \ldots, K}{\text{Minimizes}} \left\{ C_{ik_2} + \alpha \sum_{j=0}^{M} p_{ij}(k_2)\,V_j(R_1) \right\},$$

and then set $d_i(R_2)$ equal to the minimizing value of $k_2$. This procedure defines a new policy $R_2$.

If $R_2$ does not equal $R_1$, then return to step 1 by using $R_2$ instead of $R_1$, and solve for $V_i(R_2)$, i = 0, 1, ..., M. Using these values, go to step 2 and find $R_3$. Continue in this fashion until you find two successive R's to be equal. When you find them, the optimal policy is achieved, and the algorithm terminates. In fact, it can be shown that:


1. $V_i(R_{j+1}) \le V_i(R_j)$, for i = 0, 1, ..., M and j = 1, 2, ...;

2. The algorithm terminates with the optimal solution in a finite number of
iterations; and

3. The algorithm is valid without the assumption that the Markov chain associated
with every transition matrix is irreducible.
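The discounted version of the algorithm can be sketched in the same way as the average-cost version. The code below is an illustration under stated assumptions: it uses the machine-maintenance data with α = 0.9, marks a forbidden decision with an infinite cost, and keeps the current decision on ties. The function names are hypothetical.

```python
from fractions import Fraction as F

INF = float("inf")   # marks a decision that is not allowed in a state
ALPHA = F(9, 10)     # discount factor

# P[k][i][j] = p_ij(k) and C[k][i] = C_ik for the maintenance model.
P = {
    1: [[0, F(7, 8), F(1, 16), F(1, 16)],
        [0, F(3, 4), F(1, 8), F(1, 8)],
        [0, 0, F(1, 2), F(1, 2)],
        [0, 0, 0, 1]],
    2: [[0, 1, 0, 0]] * 4,
    3: [[1, 0, 0, 0]] * 4,
}
C = {1: [0, 1000, 3000, INF], 2: [4000, 4000, 4000, INF], 3: [6000] * 4}


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact arithmetic."""
    n = len(b)
    M = [[F(x) for x in row] + [F(r)] for row, r in zip(A, b)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [x - factor * z for x, z in zip(M[r], M[c])]
    return [row[n] for row in M]


def value_determination(policy):
    """Step 1: solve V_i = C_ik + alpha * sum_j p_ij(k) V_j for all i."""
    n = len(policy)
    A = [[int(i == j) - ALPHA * P[k][i][j] for j in range(n)]
         for i, k in enumerate(policy)]
    b = [C[k][i] for i, k in enumerate(policy)]
    return solve(A, b)


def policy_improvement_step(policy, V):
    """Step 2: in each state pick the decision minimizing the discounted cost."""
    n = len(policy)
    new_policy = []
    for i in range(n):
        def test(k):
            return C[k][i] + ALPHA * sum(P[k][i][j] * V[j] for j in range(n))
        best = min(P, key=test)
        # Keep the current decision on ties so the loop cannot cycle.
        new_policy.append(policy[i] if test(policy[i]) == test(best) else best)
    return new_policy


def policy_improvement(policy):
    """Alternate steps 1 and 2 until two successive policies are equal."""
    while True:
        V = value_determination(policy)
        new_policy = policy_improvement_step(policy, V)
        if new_policy == policy:
            return policy, V
        policy = new_policy


# Start from "replace in every state", an arbitrary feasible policy.
policy, V = policy_improvement([3, 3, 3, 3])
print(policy, [round(float(x), 1) for x in V])
```

The starting policy must avoid forbidden decisions, since the linear system in step 1 needs finite costs.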


EXAMPLE: We shall obtain the optimal policy for the machine-maintenance problem by the policy improvement algorithm. The discount factor is chosen to be 0.9. The first step, the value-determination step, has already been carried out earlier in this section if the arbitrary policy chosen, $R_1$, calls for leaving the machine as is when it is in state 0 or 1, overhauling it when it is in state 2, and replacing it when it is in state 3. The appropriate V's are

$$\begin{aligned}
V_0(R_1) &= 14{,}949 \\
V_1(R_1) &= 16{,}262 \\
V_2(R_1) &= 18{,}636 \\
V_3(R_1) &= 19{,}454.
\end{aligned}$$

Step 2 can now be applied. We are seeking an improved policy $R_2$ that has the property that $d_0(R_2) = k_0^2$, $d_1(R_2) = k_1^2$, $d_2(R_2) = k_2^2$, and $d_3(R_2) = k_3^2$ minimize the following expressions:

$$\begin{aligned}
(0)\quad & C_{0k_0^2} + 0.9\,[14{,}949\,p_{00}(k_0^2) + 16{,}262\,p_{01}(k_0^2) + 18{,}636\,p_{02}(k_0^2) + 19{,}454\,p_{03}(k_0^2)] \\
(1)\quad & C_{1k_1^2} + 0.9\,[14{,}949\,p_{10}(k_1^2) + 16{,}262\,p_{11}(k_1^2) + 18{,}636\,p_{12}(k_1^2) + 19{,}454\,p_{13}(k_1^2)] \\
(2)\quad & C_{2k_2^2} + 0.9\,[14{,}949\,p_{20}(k_2^2) + 16{,}262\,p_{21}(k_2^2) + 18{,}636\,p_{22}(k_2^2) + 19{,}454\,p_{23}(k_2^2)] \\
(3)\quad & C_{3k_3^2} + 0.9\,[14{,}949\,p_{30}(k_3^2) + 16{,}262\,p_{31}(k_3^2) + 18{,}636\,p_{32}(k_3^2) + 19{,}454\,p_{33}(k_3^2)].
\end{aligned}$$

Most of the necessary data can be taken from the first iteration of the example in Sec. 20.4 (the transition probabilities and $C_{ik}$). Using these data, the values of each of the four expressions are obtained as follows:


         | Value of     | Value of     | Value of     | Value of
Decision | Expression 0 | Expression 1 | Expression 2 | Expression 3
   1     |   14,949     |   16,262     |   20,140     |      ∞
   2     |   18,636     |   18,636     |   18,636     |      ∞
   3     |   19,454     |   19,454     |   19,454     |   19,454


Thus $d_0(R_2) = k_0^2 = 1$, $d_1(R_2) = k_1^2 = 1$, $d_2(R_2) = k_2^2 = 2$, and $d_3(R_2) = k_3^2 = 3$, so that this policy is identical to $R_1$. Because the policies obtained on two successive iterations are the same, the optimal policy has been obtained. Again, the optimal policy calls for leaving the machine as is when it is in state 0 or 1, overhauling it when it is in state 2, and replacing it when it is in state 3, the same policy that was obtained by using the long-run expected average cost per day criterion.


Linear Programming Formulation 


Just as there is a policy improvement algorithm for the expected long-run total discounted cost criterion, there is also a linear programming formulation. It can be shown that the statement of the linear programming problem is to choose the $y_{ik}$ that

$$\text{Minimizes}\quad \sum_{i=0}^{M}\sum_{k=1}^{K} C_{ik}\,y_{ik}$$

subject to the constraints

$$(1)\quad \sum_{k=1}^{K} y_{jk} - \alpha \sum_{i=0}^{M}\sum_{k=1}^{K} y_{ik}\,p_{ij}(k) = \beta_j, \quad \text{for } j = 0, 1, \ldots, M,$$

where $\beta_j$ are given constants such that $\beta_j > 0$ and $\sum_{j=0}^{M} \beta_j = 1$,¹ and

$$(2)\quad y_{ik} \ge 0, \quad i = 0, 1, \ldots, M, \quad k = 1, 2, \ldots, K.$$

If we consider the policy defined by

$$D_{ik} = P\{\text{decision} = k \mid \text{state} = i\} = \frac{y_{ik}}{\sum_{k=1}^{K} y_{ik}},$$

then the $y_{ik}$ can be interpreted as a weighted (in a discounted sense) expected time of being in state i and making decision k, when $P\{X_0 = j\} = \beta_j$; that is, if

$$z_{ik}^n = P\{\text{at time } n, \text{ state} = i \text{ and decision} = k\},$$

then

$$y_{ik} = z_{ik}^0 + \alpha z_{ik}^1 + \alpha^2 z_{ik}^2 + \alpha^3 z_{ik}^3 + \cdots.$$

Again, it can be shown that the optimal policy is deterministic; that is, $D_{ik} = 0$ or 1. Furthermore, the technique is valid without the assumption that the Markov chain associated with every transition matrix is irreducible.


1 The optimal policy is independent of the particular values of the $\beta_j$'s, but, of course, the expected long-run total discounted cost is a function of these $\beta_j$'s.

EXAMPLE: Returning to the machine-maintenance model (with α = 0.9), we can formulate the linear program as:

$$\text{Minimize}\quad 4{,}000y_{02} + 6{,}000y_{03} + 1{,}000y_{11} + 4{,}000y_{12} + 6{,}000y_{13} + 3{,}000y_{21} + 4{,}000y_{22} + 6{,}000y_{23} + M_1y_{31} + M_2y_{32} + 6{,}000y_{33},$$

where $M_1$ and $M_2$ are taken to be large numbers, subject to

$$\sum_{k=1}^{3} y_{0k} - 0.9\,(y_{03} + y_{13} + y_{23} + y_{33}) = \tfrac{1}{4}$$

$$\sum_{k=1}^{3} y_{1k} - 0.9\left(\tfrac{7}{8}y_{01} + y_{02} + \tfrac{3}{4}y_{11} + y_{12} + y_{22} + y_{32}\right) = \tfrac{1}{4}$$

$$\sum_{k=1}^{3} y_{2k} - 0.9\left(\tfrac{1}{16}y_{01} + \tfrac{1}{8}y_{11} + \tfrac{1}{2}y_{21}\right) = \tfrac{1}{4}$$

$$\sum_{k=1}^{3} y_{3k} - 0.9\left(\tfrac{1}{16}y_{01} + \tfrac{1}{8}y_{11} + \tfrac{1}{2}y_{21} + y_{31}\right) = \tfrac{1}{4}$$

and $y_{ik} \ge 0$, i = 0, 1, 2, 3, k = 1, 2, 3,

where $\beta_0$, $\beta_1$, $\beta_2$, and $\beta_3$ are arbitrarily chosen to be 1/4. The optimal solution yields all $y_{ik}$ equal to zero, except for $y_{01} = 1.210$, $y_{11} = 6.656$, $y_{22} = 1.067$, and $y_{33} = 1.067$. The corresponding

$$D_{ik} = \frac{y_{ik}}{\sum_{k=1}^{3} y_{ik}}$$

are given by $D_{01} = D_{11} = D_{22} = D_{33} = 1$, and all the remaining $D_{ik} = 0$. This solution is the same as that obtained earlier in this section and calls for leaving the machine as is when it is in state 0 or 1, overhauling it when it is in state 2, and replacing it when it is in state 3. The minimized value of the objective function is $17,325, and it is seen to be related to the V's of the optimal policy found in the discussion of the discounted cost policy improvement algorithm. Because $P\{X_0 = j\}$ was chosen to equal 1/4 for all j,

$$17{,}325 = \tfrac{1}{4}\,[V_0(R) + V_1(R) + V_2(R) + V_3(R)] = \tfrac{1}{4}\,[14{,}949 + 16{,}262 + 18{,}636 + 19{,}454].$$
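The relationship between the objective value and the V's can be checked numerically without running the simplex method. The sketch below fixes the decisions at those of the reported optimal solution, solves constraint (1) for the four nonzero $y_{ik}$, and compares the objective with $\sum_j \beta_j V_j(R)$. It is only a consistency check under that assumption; the helper `solve` and the names `P_R`, `C_R`, and `BETA` are illustrative.

```python
from fractions import Fraction as F

ALPHA = F(9, 10)
BETA = [F(1, 4)] * 4   # P{X_0 = j}

# Transition rows and costs for the decisions used: (0,1), (1,1), (2,2), (3,3).
P_R = [
    [0, F(7, 8), F(1, 16), F(1, 16)],
    [0, F(3, 4), F(1, 8), F(1, 8)],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
C_R = [0, 1000, 4000, 6000]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact arithmetic."""
    n = len(b)
    M = [[F(x) for x in row] + [F(r)] for row, r in zip(A, b)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [x - factor * z for x, z in zip(M[r], M[c])]
    return [row[n] for row in M]


n = 4
# Constraint (1) with one active decision per state:
#   y_j - alpha * sum_i y_i p_ij = beta_j, i.e. (I - alpha P)^T y = beta.
A_y = [[int(i == j) - ALPHA * P_R[i][j] for i in range(n)] for j in range(n)]
y = solve(A_y, BETA)
objective = sum(c * yi for c, yi in zip(C_R, y))

# Discounted costs V_i(R) from (I - alpha P) V = C, for comparison.
A_V = [[int(i == j) - ALPHA * P_R[i][j] for j in range(n)] for i in range(n)]
V = solve(A_V, C_R)
weighted = sum(b * v for b, v in zip(BETA, V))

print([round(float(v), 3) for v in y], float(objective), float(weighted))
```

In exact arithmetic the two quantities `objective` and `weighted` coincide, and the $y_{ik}$ sum to 1/(1 - α) = 10.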


Finite-Period Markovian Decision Processes 
and the Method of Successive Approximations 


Chapter 11 introduced the concept of dynamic programming and characterized deterministic dynamic programming problems and their solutions. Many of these concepts have analogous interpretations with Markovian decision processes. In particular, suppose we seek the expected total discounted cost of a system starting in state i and evolving for n time periods when an optimal policy is followed. Note that a finite number of time periods are now being considered. This problem is analogous to deterministic dynamic programming, except that the Markov system evolves according to some probabilistic laws of motion rather than evolving in a deterministic fashion. The deterministic dynamic programming solution is suggestive of the solution to this probabilistic dynamic programming problem. Denote by $V_i^n$ the expected total discounted cost of a system starting in state i and evolving for n time periods when an optimal policy is followed.¹ Using the principle of optimization, it follows that this cost function satisfies the recursive relationship,

$$V_i^{n+1} = \min_k \left\{ C_{ik} + \alpha \sum_{j=0}^{M} p_{ij}(k)\,V_j^n \right\}, \quad i = 0, 1, \ldots, M.$$
ja 


Using this recursive relationship, the solution procedure moves backward period by period, each time finding the optimal policy for that period model, until it finds the optimal policy for the original problem. In particular, it is usually assumed that $V_0^0, V_1^0, \ldots, V_M^0$ are zero, so that $V_i^1$ can be obtained from

$$V_i^1 = \min_k \{C_{ik}\}, \quad i = 0, 1, \ldots, M,$$

with the corresponding optimal decision becoming known. If this optimal policy is followed, $V_i^1$ is the minimum expected total discounted cost of a system starting in state i and evolving for one time period.

The $V_i^2$ can now be obtained from

$$V_i^2 = \min_k \left\{ C_{ik} + \alpha \sum_{j=0}^{M} p_{ij}(k)\,V_j^1 \right\}, \quad i = 0, 1, \ldots, M,$$

with the corresponding optimal decisions becoming known. If this optimal policy is followed, $V_i^2$ is the minimum expected total discounted cost of a system starting in state i and evolving for two time periods.

In a similar manner, the $V_i^T$ can be obtained from

$$V_i^T = \min_k \left\{ C_{ik} + \alpha \sum_{j=0}^{M} p_{ij}(k)\,V_j^{T-1} \right\}, \quad i = 0, 1, \ldots, M,$$

with the corresponding optimal decisions becoming known. If this optimal policy is followed, $V_i^T$ is the minimum expected total discounted cost of a system starting in state i and evolving for T time periods. Thus, solving a five-period problem requires solving the four-, three-, two-, and one-period problems also. An example of a three-period version of the machine-maintenance model will be solved later in this section. It should be noted that α can be set equal to 1 (no discounting) for finite-period problems, in which case the cost criterion becomes the expected total cost.

Until now this section has dealt with a finite-period version of a Markov decision process. When the criterion of discounted costs is used, it can be shown that the $V_i^n$ converges to $V_i$ as n approaches infinity, where $V_i$ is the expected (long-run) total discounted cost of a system starting in state i and continuing indefinitely when an optimal policy is followed and satisfies

$$V_i = \min_k \left\{ C_{ik} + \alpha \sum_{j=0}^{M} p_{ij}(k)\,V_j \right\}, \quad i = 0, 1, \ldots, M.$$

Furthermore, we obtain the optimal policy by making appropriate decisions to minimize the right-hand side of the preceding equation. This solution is, indeed, the solution to the Markov decision process considered throughout earlier sections of this chapter.

1 This notation $V_i^n$ for cost is now being used instead of the notation introduced in Chap. 11 to be consistent with the material introduced in the current chapter. In accordance with the notation of Chap. 11, the subscript (i) is the state variable, and the superscript (n) is equivalent to the stage, except that the stage is now measured by the system having "n periods to go" rather than being in period n. This change is due to the need to treat the infinite-period problem also.

Finding $V_i$ and the corresponding optimal decisions is generally difficult, but it is relatively simple to approximate $V_i$ and obtain the corresponding policy. This is what the method of successive approximations does. It uses the recursive relationship of the finite-period problem presented earlier; that is,

$$V_i^{n+1} = \min_k \left\{ C_{ik} + \alpha \sum_{j=0}^{M} p_{ij}(k)\,V_j^n \right\}, \quad i = 0, 1, \ldots, M.$$

The first step is to choose arbitrarily a set of values, $V_0^0, V_1^0, V_2^0, \ldots, V_M^0$, usually taken to be zero (as will be assumed from here on). Using the expression for $V_i^{n+1}$, $V_i^1$ can be obtained from

$$V_i^1 = \min_k \{C_{ik}\}, \quad i = 0, 1, \ldots, M,$$

with the corresponding decisions becoming known. This step can be viewed as the first approximation to the optimal policy, and as noted earlier the $V_i^1$ can be interpreted as the expected total discounted cost of a system starting in state i and evolving for one period when an optimal policy is followed.

The next iteration uses the $V_0^1, V_1^1, V_2^1, \ldots, V_M^1$ found from the previous step. From the recurrence relationship presented, $V_i^2$ can be obtained from

$$V_i^2 = \min_k \left\{ C_{ik} + \alpha \sum_{j=0}^{M} p_{ij}(k)\,V_j^1 \right\},$$

with the corresponding decisions becoming known. This policy can be viewed as the second approximation to the optimal policy, and, as noted earlier, the $V_i^2$ can be interpreted as the expected total discounted cost of a system starting in state i and evolving for two periods when an optimal policy is followed.

Further iterations can be obtained by using the recursive relationship. For the Tth iteration, $V_i^T$ can be interpreted as the expected total discounted cost of a system starting in state i and evolving for T periods when an optimal policy is followed. The number T can be made large, and $V_i^T$ will become "close" to the optimal expected long-run total discounted cost, and, for sufficiently large T, the optimal policy will be obtained. However, there is no procedure for deciding when to terminate the method of successive approximations. A check can be made at any time to see whether the current iteration satisfies the policy improvement equations; if it does, then an optimal policy has been obtained.

Although the method of successive approximations may not lead to an optimal policy (using a finite number of iterations), it has one distinct advantage over the policy improvement and linear programming techniques: It never requires the solution of a system of simultaneous equations, and hence each iteration can be performed simply and quickly.


EXAMPLE: We shall solve the machine-maintenance model (α = 0.9) by the method of successive approximations. Let $V_0^0 = V_1^0 = V_2^0 = V_3^0 = 0$. Then

$$\begin{aligned}
V_0^1 &= \min_k \{C_{0k}\} = 0 && (k = 1) \\
V_1^1 &= \min_k \{C_{1k}\} = 1{,}000 && (k = 1) \\
V_2^1 &= \min_k \{C_{2k}\} = 3{,}000 && (k = 1) \\
V_3^1 &= \min_k \{C_{3k}\} = 6{,}000 && (k = 3).
\end{aligned}$$

Thus the first approximation calls for making decision 1 (leave the machine alone) when the system is in state 0, 1, or 2. When the system is in state 3, decision 3 (replace) is made.

The second iteration leads to

$$\begin{aligned}
V_0^2 &= \min \{0 + 0.9[\tfrac{7}{8}(1{,}000) + \tfrac{1}{16}(3{,}000) + \tfrac{1}{16}(6{,}000)],\ 4{,}000 + 0.9[1(1{,}000)],\ 6{,}000 + 0.9[1(0)]\} = 1{,}294 && (k = 1) \\
V_1^2 &= \min \{1{,}000 + 0.9[\tfrac{3}{4}(1{,}000) + \tfrac{1}{8}(3{,}000) + \tfrac{1}{8}(6{,}000)],\ 4{,}000 + 0.9[1(1{,}000)],\ 6{,}000 + 0.9[1(0)]\} = 2{,}688 && (k = 1) \\
V_2^2 &= \min \{3{,}000 + 0.9[\tfrac{1}{2}(3{,}000) + \tfrac{1}{2}(6{,}000)],\ 4{,}000 + 0.9[1(1{,}000)],\ 6{,}000 + 0.9[1(0)]\} = 4{,}900 && (k = 2) \\
V_3^2 &= 6{,}000 + 0.9[1(0)] = 6{,}000 && (k = 3).
\end{aligned}$$

Thus the second approximation calls for leaving the machine as is when it is in state 0 or 1, overhauling it when it is in state 2, and replacing it when it is in state 3. Note that this policy is the optimal one, even though the optimal cost has not been obtained. The third iteration leads to

$$\begin{aligned}
V_0^3 &= \min \{0 + 0.9[\tfrac{7}{8}(2{,}688) + \tfrac{1}{16}(4{,}900) + \tfrac{1}{16}(6{,}000)],\ 4{,}000 + 0.9[1(2{,}688)],\ 6{,}000 + 0.9[1(1{,}294)]\} = 2{,}730 && (k = 1) \\
V_1^3 &= \min \{1{,}000 + 0.9[\tfrac{3}{4}(2{,}688) + \tfrac{1}{8}(4{,}900) + \tfrac{1}{8}(6{,}000)],\ 4{,}000 + 0.9[1(2{,}688)],\ 6{,}000 + 0.9[1(1{,}294)]\} = 4{,}041 && (k = 1) \\
V_2^3 &= \min \{3{,}000 + 0.9[\tfrac{1}{2}(4{,}900) + \tfrac{1}{2}(6{,}000)],\ 4{,}000 + 0.9[1(2{,}688)],\ 6{,}000 + 0.9[1(1{,}294)]\} = 6{,}419 && (k = 2) \\
V_3^3 &= 6{,}000 + 0.9[1(1{,}294)] = 7{,}165 && (k = 3).
\end{aligned}$$

Again the optimal policy is achieved, and the costs are getting closer to those of the optimal policy. This procedure can be continued, and $V_0^n$, $V_1^n$, $V_2^n$, and $V_3^n$ will converge to 14,949, 16,262, 18,636, and 19,454, respectively. It should be noted that termination of the method of successive approximations after the second iteration would have resulted in an optimal policy, although there is no way to know this fact without solving the problem by other methods.
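The iterations of this example are easy to reproduce in code. The following sketch applies the recursive relationship with all $V_i^0 = 0$, using ordinary floating-point arithmetic and an infinite cost for forbidden decisions (an assumption standing in for the large numbers $M_1$ and $M_2$). The function name `successive_approximations` and the `history` list are illustrative.

```python
INF = float("inf")   # marks a decision that is not allowed in a state
ALPHA = 0.9          # discount factor

# P[k][i][j] = p_ij(k) and C[k][i] = C_ik for the maintenance model.
P = {
    1: [[0, 7 / 8, 1 / 16, 1 / 16],
        [0, 3 / 4, 1 / 8, 1 / 8],
        [0, 0, 1 / 2, 1 / 2],
        [0, 0, 0, 1]],
    2: [[0, 1, 0, 0]] * 4,   # overhaul: next state is 1
    3: [[1, 0, 0, 0]] * 4,   # replace: next state is 0
}
C = {1: [0, 1000, 3000, INF], 2: [4000, 4000, 4000, INF], 3: [6000] * 4}


def successive_approximations(iterations, alpha=ALPHA):
    """Return [(policy, V)] after each iteration, starting from V^0 = 0."""
    n = 4
    V = [0.0] * n
    history = []
    for _ in range(iterations):
        new_V, policy = [], []
        for i in range(n):
            # Cost of each decision k: C_ik + alpha * sum_j p_ij(k) V_j
            values = {
                k: C[k][i] + alpha * sum(P[k][i][j] * V[j] for j in range(n))
                for k in P
            }
            best = min(values, key=values.get)
            policy.append(best)
            new_V.append(values[best])
        V = new_V
        history.append((policy, V))
    return history


history = successive_approximations(200)
for t in range(3):
    policy, V = history[t]
    print(t + 1, policy, [round(x, 1) for x in V])
print(200, history[-1][0], [round(x, 1) for x in history[-1][1]])
```

Because the code carries unrounded values from one iteration to the next, its third-iteration figures can differ from the hand calculation (which reuses rounded second-iteration values) by a fraction of a unit.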

As indicated earlier, the method of successive approximations solves a finite-period Markovian decision problem. In particular, the optimal solution to the one-period machine-maintenance model calls for leaving the machine alone when it is in state 0, 1, or 2, and replacing it when it is in state 3. The minimum expected total discounted cost of the system starting in state i, i = 0, 1, 2, 3, and evolving for one period is given by 0, 1,000, 3,000, and 6,000, respectively. The optimal solution to the two-period machine-maintenance model is


Period 1          Leave machine alone when it is in state 0 or 1.
                  Overhaul machine when it is in state 2.
                  Replace machine when it is in state 3.

Period 2          Leave machine alone when it is in state 0, 1, or 2.
                  Replace machine when it is in state 3.


The minimum expected total discounted cost of the system starting in state i, i = 0, 1, 2, 3, and evolving for two periods is given by 1,294, 2,688, 4,900, and 6,000, respectively. Finally, the optimal solution to the three-period model is


Periods 1 and 2   Leave machine alone when it is in state 0 or 1.
                  Overhaul machine when it is in state 2.
                  Replace machine when it is in state 3.

Period 3          Leave machine alone when it is in state 0, 1, or 2.
                  Replace machine when it is in state 3.


The minimum expected total discounted costs over three periods, if the system starts in state i, i = 0, 1, 2, 3, are given by 2,730, 4,041, 6,419, and 7,165, respectively.


20.6 A Water-Resource Model 


A multipurpose dam is used for generating electric power as well as for flood control. The capacity of the dam is 3 units. The probability distribution of the quantity of water, W, that flows into the dam during month t (for t = 0, 1, ...) is given by $P_W(m)$, where

$$\begin{aligned}
P_W(0) &= P\{W = 0\} = \tfrac{1}{6} \\
P_W(1) &= P\{W = 1\} = \tfrac{1}{3} \\
P_W(2) &= P\{W = 2\} = \tfrac{1}{3} \\
P_W(3) &= P\{W = 3\} = \tfrac{1}{6}.
\end{aligned}$$



For the purpose of generating electric power, 1 unit of water is required. At the 
beginning of each month, water is released from the dam. The first unit is used to 
generate electric power and then used for irrigation purposes, the latter function being 
worth $100,000. If additional units are released, they can also be used for irrigation 
purposes, and each unit is worth $100,000. If the dam contains less than 1 unit at the 
beginning of a month, additional power must be purchased at a cost of $300,000. If 
at any time the water in the dam exceeds the capacity of 3 units, the excess water is 
released through the spillways at no cost or gain. 

A release policy is sought. Policies are to be compared on the basis of expected 
discounted cost, with discount factor a = 0.99. The policy improvement algorithm 
will be used. 


Let $X_t$ denote the amount of water in the dam at time t. Then $X_t = 0, 1, 2, 3$. The natural laws of motion for this system (no water released) are given by the transition matrix:


State |  0      1      2      3
  0   | 1/6    1/3    1/3    1/6
  1   |  0     1/6    1/3    1/2
  2   |  0      0     1/6    5/6
  3   |  0      0      0      1


For example, the element in the second row and fourth column, $p_{13}$, is obtained as follows: If the dam contains 1 unit of water now, then for it to contain 3 units of water a month later, 2 or 3 units of water must flow into the dam during the month (recall that dam capacity is 3 units, so that a flow of 3 units will result in 1 unit being released through the spillways). This occurs with probability 1/3 + 1/6 = 1/2.
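Each row of such a matrix can be generated mechanically from the inflow distribution. The sketch below does so, taking the inflow probabilities as 1/6, 1/3, 1/3, 1/6 for 0, 1, 2, 3 units (consistent with the value 1/3 + 1/6 = 1/2 just computed for $p_{13}$); the function `transition_row` and its `release` argument are illustrative names, with `release = 0` giving the natural laws of motion.

```python
from fractions import Fraction as F

CAPACITY = 3
# P_W(m): probability that m units flow into the dam during a month.
INFLOW = {0: F(1, 6), 1: F(1, 3), 2: F(1, 3), 3: F(1, 6)}


def transition_row(level, release):
    """Next-month distribution given the current level and units released."""
    remaining = level - min(release, level)   # cannot release more than is there
    row = [F(0)] * (CAPACITY + 1)
    for inflow, prob in INFLOW.items():
        # Anything above capacity is lost through the spillways.
        row[min(remaining + inflow, CAPACITY)] += prob
    return row


# Natural laws of motion: no water released in any state.
natural = [transition_row(level, 0) for level in range(CAPACITY + 1)]
for level, row in enumerate(natural):
    print(level, [str(p) for p in row])
```

Passing a nonzero `release` gives the row that applies under a release decision; for instance, `transition_row(3, 2)` describes a full dam from which 2 units are released.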

There are three possible decisions that can be made at the beginning of each month:


Decision | Action
   1     | Release 1 unit
   2     | Release 2 units
   3     | Release 3 units


It is clear that releasing no units is not a sensible action, because 1 unit is needed for electric power generation anyway. Thus a policy calls for determining how many units to release as a function of the quantity of water found in the dam. A typical policy $R_1$ might call for releasing all the water in the dam if it contains 0, 1, or 2 units, and releasing 2 units if it contains 3 units. The resultant transition matrix is given by


State |  0      1      2      3
  0   | 1/6    1/3    1/3    1/6
  1   | 1/6    1/3    1/3    1/6
  2   | 1/6    1/3    1/3    1/6
  3   |  0     1/6    1/3    1/2






Of course, a policy that calls for releasing 3 units when there is only 1 unit in 
the dam is to be interpreted as calling for releasing all the available water. Necessary 
cost information can be obtained from the following data: 

State | Decision | Cost (in Hundred Thousands) 
  0   |    1     |    3 
      |    2     |    3 
      |    3     |    3 
  1   |    1     |   -1 
      |    2     |   -1 
      |    3     |   -1 
  2   |    1     |   -1 
      |    2     |   -2 
      |    3     |   -2 
  3   |    1     |   -1 
      |    2     |   -2 
      |    3     |   -3 


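The policy \(R_1\) will be used in the value-determination step (step 1) of the policy 
improvement algorithm. Using the cost information just given, the values of \(C_{ik}\) are 

\[
C_{01} = 3, \qquad C_{11} = -1, \qquad C_{22} = -2, \qquad C_{32} = -2.
\]

The following four equations must be solved: 

\[
\begin{aligned}
V_0(R_1) &= 3 + 0.99\left[\tfrac{1}{6}V_0(R_1) + \tfrac{1}{3}V_1(R_1) + \tfrac{1}{3}V_2(R_1) + \tfrac{1}{6}V_3(R_1)\right] \\
V_1(R_1) &= -1 + 0.99\left[\tfrac{1}{6}V_0(R_1) + \tfrac{1}{3}V_1(R_1) + \tfrac{1}{3}V_2(R_1) + \tfrac{1}{6}V_3(R_1)\right] \\
V_2(R_1) &= -2 + 0.99\left[\tfrac{1}{6}V_0(R_1) + \tfrac{1}{3}V_1(R_1) + \tfrac{1}{3}V_2(R_1) + \tfrac{1}{6}V_3(R_1)\right] \\
V_3(R_1) &= -2 + 0.99\left[\tfrac{1}{6}V_1(R_1) + \tfrac{1}{3}V_2(R_1) + \tfrac{1}{2}V_3(R_1)\right].
\end{aligned}
\]

The simultaneous solution of these equations results in the values 

\[
V_0(R_1) = -103.881, \quad V_1(R_1) = -107.881, \quad V_2(R_1) = -108.881, \quad V_3(R_1) = -110.358.
\]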
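The value-determination step for this first policy is just a 4 x 4 linear system, (I - αP)V = C. A minimal sketch in plain Python follows; the helper `solve` (Gauss-Jordan elimination) and the names `P_R1`, `C_R1`, and `V_R1` are illustrative, and the transition probabilities assume monthly inflows of 0, 1, 2, 3 units with probabilities 1/6, 1/3, 1/3, 1/6.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [[float(x) for x in row] + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


ALPHA = 0.99
EMPTY_THEN_INFLOW = [1 / 6, 1 / 3, 1 / 3, 1 / 6]  # dam emptied, then a month of inflow
# Release everything in states 0, 1, 2; release 2 units (leaving 1) in state 3.
P_R1 = [EMPTY_THEN_INFLOW, EMPTY_THEN_INFLOW, EMPTY_THEN_INFLOW, [0.0, 1 / 6, 1 / 3, 1 / 2]]
C_R1 = [3.0, -1.0, -2.0, -2.0]  # costs in hundred thousands of dollars

# Build (I - alpha * P) and solve for the expected discounted costs V_0 .. V_3.
A = [[(1.0 if i == j else 0.0) - ALPHA * P_R1[i][j] for j in range(4)] for i in range(4)]
V_R1 = solve(A, C_R1)
print([round(v, 3) for v in V_R1])
```

For larger state spaces a library solver would normally replace the hand-written elimination; it is written out here only to keep the sketch self-contained.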


Step 2 can now be applied. We want to find an improved policy \(R_2\) with the 
property that the decisions \(d_0(R_2) = k_0\), \(d_1(R_2) = k_1\), \(d_2(R_2) = k_2\), and 
\(d_3(R_2) = k_3\) minimize the following expressions: 

\[
\begin{aligned}
(0)\quad & C_{0k_0} + 0.99\left[-103.881\,p_{00}(k_0) - 107.881\,p_{01}(k_0) - 108.881\,p_{02}(k_0) - 110.358\,p_{03}(k_0)\right] \\
(1)\quad & C_{1k_1} + 0.99\left[-103.881\,p_{10}(k_1) - 107.881\,p_{11}(k_1) - 108.881\,p_{12}(k_1) - 110.358\,p_{13}(k_1)\right] \\
(2)\quad & C_{2k_2} + 0.99\left[-103.881\,p_{20}(k_2) - 107.881\,p_{21}(k_2) - 108.881\,p_{22}(k_2) - 110.358\,p_{23}(k_2)\right] \\
(3)\quad & C_{3k_3} + 0.99\left[-103.881\,p_{30}(k_3) - 107.881\,p_{31}(k_3) - 108.881\,p_{32}(k_3) - 110.358\,p_{33}(k_3)\right].
\end{aligned}
\]


To find \(k_0\), the best decision when the system is in state 0, it is necessary to 
evaluate the first expression for all possible decisions. It is clear that when the system 
is in state 0 (dam empty), there is no choice among the decisions because they all are 
equivalent. The data for the necessary calculations for evaluating expression (0) 
follow: 

State 0 
Decision | p_00(k_0) | p_01(k_0) | p_02(k_0) | p_03(k_0) | C_0k_0 | Total Value of Expression 0 
1, 2, 3  |    1/6    |    1/3    |    1/3    |    1/6    |    3   | -103.881 


Similarly, for state 1 there is no choice among the decisions because they are all 
equivalent. The data for the necessary calculations for evaluating expression (1) follow: 

State 1 
Decision | p_10(k_1) | p_11(k_1) | p_12(k_1) | p_13(k_1) | C_1k_1 | Total Value of Expression 1 
1, 2, 3  |    1/6    |    1/3    |    1/3    |    1/6    |   -1   | -107.881 


For the remaining two states, the appropriate transition probabilities and costs 
generally depend upon the decisions made. The data for the necessary calculations 
for finding the best decisions, given the dam is in state 2 or 3, follow: 

State 2 
Decision | p_20(k_2) | p_21(k_2) | p_22(k_2) | p_23(k_2) | C_2k_2 | Total Value of Expression 2 
   1     |     0     |    1/6    |    1/3    |    1/2    |   -1   | -109.358 
  2, 3   |    1/6    |    1/3    |    1/3    |    1/6    |   -2   | -108.881 

State 3 
Decision | p_30(k_3) | p_31(k_3) | p_32(k_3) | p_33(k_3) | C_3k_3 | Total Value of Expression 3 
   1     |     0     |     0     |    1/6    |    5/6    |   -1   | -110.011 
   2     |     0     |    1/6    |    1/3    |    1/2    |   -2   | -110.358 
   3     |    1/6    |    1/3    |    1/3    |    1/6    |   -3   | -109.881 
Thus \(d_0(R_2) = k_0 = d_1(R_2) = k_1 = 1\), 2, or 3; \(d_2(R_2) = k_2 = 1\); and \(d_3(R_2) = k_3 = 2\). 
Hence policy \(R_2\) calls for releasing all the water when there is 1 unit in the 
dam, 1 unit of water when there are 2 units available in the dam, and 2 units when 
there are 3 units available in the dam. This policy differs from \(R_1\), so that another 
iteration is required. For the value-determination step, the equations that must now 
be solved are 

\[
\begin{aligned}
V_0(R_2) &= 3 + 0.99\left[\tfrac{1}{6}V_0(R_2) + \tfrac{1}{3}V_1(R_2) + \tfrac{1}{3}V_2(R_2) + \tfrac{1}{6}V_3(R_2)\right] \\
V_1(R_2) &= -1 + 0.99\left[\tfrac{1}{6}V_0(R_2) + \tfrac{1}{3}V_1(R_2) + \tfrac{1}{3}V_2(R_2) + \tfrac{1}{6}V_3(R_2)\right] \\
V_2(R_2) &= -1 + 0.99\left[\tfrac{1}{6}V_1(R_2) + \tfrac{1}{3}V_2(R_2) + \tfrac{1}{2}V_3(R_2)\right] \\
V_3(R_2) &= -2 + 0.99\left[\tfrac{1}{6}V_1(R_2) + \tfrac{1}{3}V_2(R_2) + \tfrac{1}{2}V_3(R_2)\right].
\end{aligned}
\]

The simultaneous solution of these equations results in the values \(V_0(R_2) = -119.642\), 
\(V_1(R_2) = -123.642\), \(V_2(R_2) = -125.119\), and \(V_3(R_2) = -126.119\). 

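Step 2 can now be applied. We want to find an improved policy \(R_3\) with the 
property that the decisions \(d_0(R_3) = k_0\), \(d_1(R_3) = k_1\), \(d_2(R_3) = k_2\), and 
\(d_3(R_3) = k_3\) minimize the following expressions: 

\[
\begin{aligned}
(0)\quad & C_{0k_0} + 0.99\left[-119.642\,p_{00}(k_0) - 123.642\,p_{01}(k_0) - 125.119\,p_{02}(k_0) - 126.119\,p_{03}(k_0)\right] \\
(1)\quad & C_{1k_1} + 0.99\left[-119.642\,p_{10}(k_1) - 123.642\,p_{11}(k_1) - 125.119\,p_{12}(k_1) - 126.119\,p_{13}(k_1)\right] \\
(2)\quad & C_{2k_2} + 0.99\left[-119.642\,p_{20}(k_2) - 123.642\,p_{21}(k_2) - 125.119\,p_{22}(k_2) - 126.119\,p_{23}(k_2)\right] \\
(3)\quad & C_{3k_3} + 0.99\left[-119.642\,p_{30}(k_3) - 123.642\,p_{31}(k_3) - 125.119\,p_{32}(k_3) - 126.119\,p_{33}(k_3)\right].
\end{aligned}
\]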


The data on the transition matrices and the costs from the previous iteration can 
again be used; the resulting values of the expressions are 

Decision | Value of Expression 0 | Value of Expression 1 | Value of Expression 2 | Value of Expression 3 
   1     |       -119.642        |       -123.642        |       -125.119        |       -125.693 
   2     |       -119.642        |       -123.642        |       -124.642        |       -126.119 
   3     |       -119.642        |       -123.642        |       -124.642        |       -125.642 





Thus \(d_0(R_3) = k_0 = d_1(R_3) = k_1 = 1\), 2, or 3; \(d_2(R_3) = k_2 = 1\); and 
\(d_3(R_3) = k_3 = 2\). Hence policy \(R_3\) and policy \(R_2\) are identical, and the optimal release 
policy calls for releasing all the water when there is 1 unit in the dam, 1 unit of water 
when there are 2 units available in the dam, and 2 units when there are 3 units available 
in the dam. 

Of course, direct enumeration would have been just as simple a technique to 
use in this situation, but the policy improvement algorithm was used for illustrative 
purposes. 
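The two iterations above can be reproduced with a short sketch of the policy improvement algorithm for the discounted-cost criterion. It is a sketch under stated assumptions, not a definitive implementation: the inflow probabilities (1/6, 1/3, 1/3, 1/6) are assumed, ties are broken in favor of the smallest decision number, and all function names are hypothetical.

```python
ALPHA = 0.99
INFLOW = [1 / 6, 1 / 3, 1 / 3, 1 / 6]  # assumed P{inflow = 0, 1, 2, 3 units}
STATES = range(4)
DECISIONS = (1, 2, 3)                  # nominal number of units to release


def solve(a, b):
    """Gauss-Jordan elimination with partial pivoting for a x = b."""
    n = len(b)
    m = [[float(x) for x in row] + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def cost(i, k):
    """Cost in hundred thousands: buy power if empty, else earn 1 per unit released."""
    return 3.0 if i == 0 else -float(min(i, k))


def row(i, k):
    """Next-state distribution after releasing min(i, k) units from state i."""
    left = i - min(i, k)
    probs = [0.0] * 4
    for inflow, p in enumerate(INFLOW):
        probs[min(left + inflow, 3)] += p
    return probs


def evaluate(policy):
    """Step 1 (value determination): solve (I - alpha P) V = C for the policy."""
    a = [[(1.0 if i == j else 0.0) - ALPHA * row(i, policy[i])[j] for j in STATES]
         for i in STATES]
    return solve(a, [cost(i, policy[i]) for i in STATES])


def improve(values):
    """Step 2 (policy improvement): pick the minimizing decision in every state."""
    def q(i, k):
        return cost(i, k) + ALPHA * sum(p * v for p, v in zip(row(i, k), values))
    return [min(DECISIONS, key=lambda k: q(i, k)) for i in STATES]


policy = [1, 1, 2, 2]  # R1: release everything in states 0-2, 2 units in state 3
while True:
    values = evaluate(policy)
    new_policy = improve(values)
    if new_policy == policy:
        break
    policy = new_policy

print(policy, [round(v, 3) for v in values])
```

Because decisions 1, 2, and 3 are equivalent in states 0 and 1, the sketch reports decision 1 there; any of the three would be equally valid.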


20.7 Inventory Model 


In Chap. 15 the following inventory problem was considered. A camera store stocks 
a particular model camera that can be ordered weekly. Let \(D_1, D_2, \ldots\) represent 
the demand for this camera during the first week, the second week, . . . , respectively. 
It is assumed that the \(D_i\) are independent, identically distributed random variables 
having a Poisson distribution with parameter \(\lambda\) equal to 1. Let \(X_0\) represent the number 
of cameras on hand at the outset, \(X_1\) the number of cameras on hand at the end of 
week one, \(X_2\) the number of cameras on hand at the end of week two, and so forth. 
On Saturday night the store places an order that is delivered in time for the opening 
of the store on Monday. The store uses an (s, S) ordering policy. If the number of 
cameras on hand at the end of the week is less than s = 1 (no cameras in stock), the 
store orders up to S = 3. Otherwise, the store does not order (if there are any cameras 
in stock, no order is placed). It is assumed that sales are lost when demand exceeds 
the inventory on hand (no backlogging). The cost structure considered calls for 
incurring a penalty cost of $50 per unit for each unit of unsatisfied demand (lost sales). 
If z > 0 cameras are ordered, the cost incurred is 10 + 25z dollars. If no cameras 
are ordered, no ordering cost is incurred. Holding costs are to be neglected. In Sec. 
15.7, this policy was evaluated by using the (long-run) expected average cost per unit 
time as the criterion. It is not evident that this policy is optimal, and the purpose of 
this section is to find the optimal policy. Even though we know that the optimal policy 
must be of the (s, S) form, we shall consider all possible policies, although we shall 
assume that three cameras is the maximum number of cameras that the store will 
stock. The policy improvement algorithm will be used first, followed by the linear 
programming formulation. 

Because \(X_t\) represents the state of the system, i.e., the number of cameras on 
hand at the end of week \(t\) (before ordering), then \(X_t = 0, 1, 2, 3\). Similarly, there 
are four possible decisions: 

Decision | Action 
   0     | Do not order 
   1     | Order 1 camera 
   2     | Order 2 cameras 
   3     | Order 3 cameras 


The possible transitions are given by¹ 

Decision 0 
State |     0          1          2          3 
  0   |     1          0          0          0 
  1   | P{D ≥ 1}   P{D = 0}       0          0 
  2   | P{D ≥ 2}   P{D = 1}   P{D = 0}       0 
  3   | P{D ≥ 3}   P{D = 2}   P{D = 1}   P{D = 0} 





1 Note that in this example the set of possible decisions varies with the states, 
Decision 1 
State |     0          1          2          3 
  0   | P{D ≥ 1}   P{D = 0}       0          0 
  1   | P{D ≥ 2}   P{D = 1}   P{D = 0}       0 
  2   | P{D ≥ 3}   P{D = 2}   P{D = 1}   P{D = 0} 
  3   | Decision 1 not permitted 

Decision 2 
State |     0          1          2          3 
  0   | P{D ≥ 2}   P{D = 1}   P{D = 0}       0 
  1   | P{D ≥ 3}   P{D = 2}   P{D = 1}   P{D = 0} 
 2, 3 | Decision 2 not permitted 

Decision 3 
State   |     0          1          2          3 
   0    | P{D ≥ 3}   P{D = 2}   P{D = 1}   P{D = 0} 
1, 2, 3 | Decision 3 not permitted 

Recalling that the demand D is a Poisson random variable with parameter \(\lambda = 1\), 
and using appendix Table A.5.4, these transitions can now be expressed as 

Decision 0 
State |   0       1       2       3 
  0   |   1       0       0       0 
  1   | 0.632   0.368     0       0 
  2   | 0.264   0.368   0.368     0 
  3   | 0.080   0.184   0.368   0.368 

Decision 1 
State |   0       1       2       3 
  0   | 0.632   0.368     0       0 
  1   | 0.264   0.368   0.368     0 
  2   | 0.080   0.184   0.368   0.368 
  3   | Decision 1 not permitted 

Decision 2 
State |   0       1       2       3 
  0   | 0.264   0.368   0.368     0 
  1   | 0.080   0.184   0.368   0.368 
 2, 3 | Decision 2 not permitted 

Decision 3 
State   |   0       1       2       3 
   0    | 0.080   0.184   0.368   0.368 
1, 2, 3 | Decision 3 not permitted 

The cost information required is similar to that given in Sec. 15.7, and you are 
urged to review this material. A summary is given by 

State | Decision | Cost                       | Expected Cost Per Week, C_ik 
  0   |    0     | 50D                        | 50E(D) = 50 
      |    1     | 35 + 50 max {(D - 1), 0}   | 35 + 50[1P{D = 2} + 2P{D = 3} + ...] = 53.4 
      |    2     | 60 + 50 max {(D - 2), 0}   | 60 + 50[1P{D = 3} + 2P{D = 4} + ...] = 65.2 
      |    3     | 85 + 50 max {(D - 3), 0}   | 85 + 50[1P{D = 4} + 2P{D = 5} + ...] = 86.2 
  1   |    0     | 50 max {(D - 1), 0}        | 50[1P{D = 2} + 2P{D = 3} + ...] = 18.4 
      |    1     | 35 + 50 max {(D - 2), 0}   | 35 + 50[1P{D = 3} + 2P{D = 4} + ...] = 40.2 
      |    2     | 60 + 50 max {(D - 3), 0}   | 60 + 50[1P{D = 4} + 2P{D = 5} + ...] = 61.2 
      |    3     | Decision 3 not permitted   | 
  2   |    0     | 50 max {(D - 2), 0}        | 50[1P{D = 3} + 2P{D = 4} + ...] = 5.2 
      |    1     | 35 + 50 max {(D - 3), 0}   | 35 + 50[1P{D = 4} + 2P{D = 5} + ...] = 36.2 
      |   2, 3   | Decisions 2, 3 not permitted | 
  3   |    0     | 50 max {(D - 3), 0}        | 50[1P{D = 4} + 2P{D = 5} + ...] = 1.2 
      | 1, 2, 3  | Decisions 1, 2, 3 not permitted | 
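The transition probabilities and expected weekly costs for the camera store can be recomputed directly from the Poisson distribution rather than read from a table. The following is a minimal sketch; the function names are hypothetical, and the closed form used for the expected shortage, E[max(D - m, 0)] = λ - m + Σ_{d=0}^{m} (m - d) P{D = d}, is a standard identity rather than something stated in the text.

```python
from math import exp, factorial

LAM = 1.0  # Poisson demand parameter


def pois(n):
    """P{D = n} for Poisson demand."""
    return exp(-LAM) * LAM ** n / factorial(n)


def tail(n):
    """P{D >= n}."""
    return 1.0 - sum(pois(d) for d in range(n))


def transition_row(stock):
    """Distribution of next week's state when the week opens with `stock` cameras."""
    return [tail(stock)] + [pois(stock - j) if j <= stock else 0.0 for j in range(1, 4)]


def expected_cost(state, order):
    """Expected weekly cost C_ik: ordering cost plus $50 per unit of lost sales."""
    stock = state + order
    ordering = 10 + 25 * order if order > 0 else 0
    shortage = LAM - stock + sum((stock - d) * pois(d) for d in range(stock + 1))
    return ordering + 50 * shortage


for stock in range(4):
    print(stock, [round(p, 3) for p in transition_row(stock)])
for state in range(4):
    print(state, [round(expected_cost(state, k), 1) for k in range(4 - state)])
```

Only orders that keep the stock at three cameras or fewer are evaluated, which mirrors the "not permitted" entries in the tables.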


Choose the (s, S) policy already introduced as the initial policy for carrying out 
the value-determination step (step 1) of the policy improvement algorithm. This policy, 
\(R_1\), calls for ordering up to 3 units whenever the system is in state 0 (no cameras on 
hand); otherwise, no order is placed. With this policy, the following four equations 
must be solved simultaneously for \(g(R_1)\), \(v_0(R_1)\), \(v_1(R_1)\), and \(v_2(R_1)\) [recall that \(v_3(R_1)\) 
is arbitrarily taken to be zero]: 

\[
\begin{aligned}
g(R_1) &= C_{0k_0} + \sum_{j=0}^{3} p_{0j}(k_0)\,v_j(R_1) - v_0(R_1) \\
g(R_1) &= C_{1k_1} + \sum_{j=0}^{3} p_{1j}(k_1)\,v_j(R_1) - v_1(R_1) \\
g(R_1) &= C_{2k_2} + \sum_{j=0}^{3} p_{2j}(k_2)\,v_j(R_1) - v_2(R_1) \\
g(R_1) &= C_{3k_3} + \sum_{j=0}^{3} p_{3j}(k_3)\,v_j(R_1) - v_3(R_1),
\end{aligned}
\]

or, alternatively, 

\[
\begin{aligned}
g(R_1) &= 86.2 + 0.080\,v_0(R_1) + 0.184\,v_1(R_1) + 0.368\,v_2(R_1) - v_0(R_1) \\
g(R_1) &= 18.4 + 0.632\,v_0(R_1) + 0.368\,v_1(R_1) - v_1(R_1) \\
g(R_1) &= 5.2 + 0.264\,v_0(R_1) + 0.368\,v_1(R_1) + 0.368\,v_2(R_1) - v_2(R_1) \\
g(R_1) &= 1.2 + 0.080\,v_0(R_1) + 0.184\,v_1(R_1) + 0.368\,v_2(R_1).
\end{aligned}
\]

The simultaneous solution of this system of equations yields 

\[
g(R_1) = 31.43, \quad v_0(R_1) = 85.00, \quad v_1(R_1) = 64.38, \quad v_2(R_1) = 31.49.
\]
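For the average-cost criterion the unknowns are the gain g and the relative values v_0, v_1, v_2 (with v_3 fixed at zero), so the value-determination step is again a 4 x 4 linear system. A minimal sketch, using the rounded three-decimal probabilities and one-decimal costs quoted in this section (the helper `solve` and the variable names are illustrative):

```python
def solve(a, b):
    """Gauss-Jordan elimination with partial pivoting for a x = b."""
    n = len(b)
    m = [[float(x) for x in row] + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Policy R1: order 3 cameras in state 0, otherwise do not order.
P = [
    [0.080, 0.184, 0.368, 0.368],  # state 0, order 3
    [0.632, 0.368, 0.0, 0.0],      # state 1, no order
    [0.264, 0.368, 0.368, 0.0],    # state 2, no order
    [0.080, 0.184, 0.368, 0.368],  # state 3, no order
]
C = [86.2, 18.4, 5.2, 1.2]

# Each equation reads g + v_i - sum_j p_ij v_j = C_i with v_3 = 0,
# so the unknown vector is x = [g, v_0, v_1, v_2].
A = [[1.0] + [(1.0 if i == j else 0.0) - P[i][j] for j in range(3)] for i in range(4)]
g, v0, v1, v2 = solve(A, C)
print(round(g, 2), round(v0, 2), round(v1, 2), round(v2, 2))
```

Fixing v_3 = 0 is what makes the system square; any other state could be used as the reference without changing g.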
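Step 2 can now be applied. It is necessary to find the improved policy \(R_2\), which 
has the property that the decisions \(d_0(R_2) = k_0\), \(d_1(R_2) = k_1\), \(d_2(R_2) = k_2\), and 
\(d_3(R_2) = k_3\) minimize the following expressions: 

\[
\begin{aligned}
(0)\quad & C_{0k_0} + p_{00}(k_0)\,85 + p_{01}(k_0)\,64.38 + p_{02}(k_0)\,31.49 - 85 \\
(1)\quad & C_{1k_1} + p_{10}(k_1)\,85 + p_{11}(k_1)\,64.38 + p_{12}(k_1)\,31.49 - 64.38 \\
(2)\quad & C_{2k_2} + p_{20}(k_2)\,85 + p_{21}(k_2)\,64.38 + p_{22}(k_2)\,31.49 - 31.49 \\
(3)\quad & C_{3k_3} + p_{30}(k_3)\,85 + p_{31}(k_3)\,64.38 + p_{32}(k_3)\,31.49.
\end{aligned}
\]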


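To find the optimal decisions, the following data are required: 

State 0 
Decision | p_00(k_0) | p_01(k_0) | p_02(k_0) | C_0k_0 | Total Value of Expression 0 
   0     |   1       |   0       |   0       |  50    | 50 
   1     |   0.632   |   0.368   |   0       |  53.4  | 45.81 
   2     |   0.264   |   0.368   |   0.368   |  65.2  | 37.92 
   3     |   0.080   |   0.184   |   0.368   |  86.2  | 31.43 

State 1 
Decision | p_10(k_1) | p_11(k_1) | p_12(k_1) | C_1k_1 | Total Value of Expression 1 
   0     |   0.632   |   0.368   |   0       |  18.4  | 31.43 
   1     |   0.264   |   0.368   |   0.368   |  40.2  | 33.54 
   2     |   0.080   |   0.184   |   0.368   |  61.2  | 27.05 

State 2 
Decision | p_20(k_2) | p_21(k_2) | p_22(k_2) | C_2k_2 | Total Value of Expression 2 
   0     |   0.264   |   0.368   |   0.368   |   5.2  | 31.43 
   1     |   0.080   |   0.184   |   0.368   |  36.2  | 34.94 

State 3 
Decision | p_30(k_3) | p_31(k_3) | p_32(k_3) | C_3k_3 | Total Value of Expression 3 
   0     |   0.080   |   0.184   |   0.368   |   1.2  | 31.43 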





Thus \(d_0(R_2) = k_0 = 3\), \(d_1(R_2) = k_1 = 2\), \(d_2(R_2) = k_2 = d_3(R_2) = k_3 = 0\). Hence 
policy \(R_2\) calls for ordering up to three cameras whenever there is 0 or 1 camera in 
stock; otherwise, no ordering is done; i.e., if the number of cameras on hand at the 
end of the week is less than s = 2 cameras, the store orders up to S = 3 cameras. 
Because policy \(R_2\) differs from policy \(R_1\), another iteration is required. The following 
four equations must be solved simultaneously for \(g(R_2)\), \(v_0(R_2)\), \(v_1(R_2)\), and \(v_2(R_2)\): 

\[
\begin{aligned}
g(R_2) &= 86.2 + 0.080\,v_0(R_2) + 0.184\,v_1(R_2) + 0.368\,v_2(R_2) - v_0(R_2) \\
g(R_2) &= 61.2 + 0.080\,v_0(R_2) + 0.184\,v_1(R_2) + 0.368\,v_2(R_2) - v_1(R_2) \\
g(R_2) &= 5.2 + 0.264\,v_0(R_2) + 0.368\,v_1(R_2) + 0.368\,v_2(R_2) - v_2(R_2) \\
g(R_2) &= 1.2 + 0.080\,v_0(R_2) + 0.184\,v_1(R_2) + 0.368\,v_2(R_2).
\end{aligned}
\]

The simultaneous solution of this system of equations yields 

\[
g(R_2) = 30.33, \quad v_0(R_2) = 85.00, \quad v_1(R_2) = 60.00, \quad v_2(R_2) = 30.68.
\]

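Step 2 can now be applied. It is necessary to find the improved policy \(R_3\), which 
has the property that the decisions \(d_0(R_3) = k_0\), \(d_1(R_3) = k_1\), \(d_2(R_3) = k_2\), and 
\(d_3(R_3) = k_3\) minimize the following expressions: 

\[
\begin{aligned}
(0)\quad & C_{0k_0} + p_{00}(k_0)\,85 + p_{01}(k_0)\,60 + p_{02}(k_0)\,30.68 - 85 \\
(1)\quad & C_{1k_1} + p_{10}(k_1)\,85 + p_{11}(k_1)\,60 + p_{12}(k_1)\,30.68 - 60 \\
(2)\quad & C_{2k_2} + p_{20}(k_2)\,85 + p_{21}(k_2)\,60 + p_{22}(k_2)\,30.68 - 30.68 \\
(3)\quad & C_{3k_3} + p_{30}(k_3)\,85 + p_{31}(k_3)\,60 + p_{32}(k_3)\,30.68.
\end{aligned}
\]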


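Using the data from the previous iteration, the relevant calculations are 

Decision | Total Value of Expression 0 | Total Value of Expression 1 | Total Value of Expression 2 | Total Value of Expression 3 
   0     |          50.00              |          34.20              |          30.33              |          30.33 
   1     |          44.20              |          36.01              |          34.65              |     not permitted 
   2     |          36.01              |          30.33              |     not permitted           |     not permitted 
   3     |          30.33              |     not permitted           |     not permitted           |     not permitted 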
Thus \(d_0(R_3) = k_0 = 3\), \(d_1(R_3) = k_1 = 2\), \(d_2(R_3) = k_2 = d_3(R_3) = k_3 = 0\). Hence 
policy \(R_3\) and policy \(R_2\) are identical, so that the optimal policy calls for ordering up 
to three cameras when there is 0 or 1 camera in stock; otherwise, no ordering is done. 
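The full policy improvement loop for the expected average cost criterion can be sketched as follows. The data are the rounded values quoted in this section; the dictionary `ROW` (indexed by the stock level after ordering), the table `COST`, and the function names are hypothetical conveniences rather than notation from the text.

```python
def solve(a, b):
    """Gauss-Jordan elimination with partial pivoting for a x = b."""
    n = len(b)
    m = [[float(x) for x in row] + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Next-state distribution, indexed by the stock on hand after ordering.
ROW = {
    0: [1.0, 0.0, 0.0, 0.0],
    1: [0.632, 0.368, 0.0, 0.0],
    2: [0.264, 0.368, 0.368, 0.0],
    3: [0.080, 0.184, 0.368, 0.368],
}
# COST[i][k]: expected weekly cost in state i when k cameras are ordered.
COST = [[50.0, 53.4, 65.2, 86.2], [18.4, 40.2, 61.2], [5.2, 36.2], [1.2]]


def evaluate(policy):
    """Step 1: solve g + v_i - sum_j p_ij v_j = C_i with v_3 = 0."""
    a, b = [], []
    for i in range(4):
        p = ROW[i + policy[i]]
        a.append([1.0] + [(1.0 if i == j else 0.0) - p[j] for j in range(3)])
        b.append(COST[i][policy[i]])
    g, *v = solve(a, b)
    return g, v + [0.0]


def improve(v):
    """Step 2: in each state pick the permitted decision minimizing the test quantity."""
    new_policy = []
    for i in range(4):
        def expr(k):
            return COST[i][k] + sum(p * vj for p, vj in zip(ROW[i + k], v)) - v[i]
        new_policy.append(min(range(len(COST[i])), key=expr))
    return new_policy


policy = [3, 0, 0, 0]  # R1: order up to 3 only when out of stock
while True:
    g, v = evaluate(policy)
    new_policy = improve(v)
    if new_policy == policy:
        break
    policy = new_policy

print(policy, round(g, 2))
```

Subtracting v_i in `expr` does not change which decision is chosen; it is kept so that the computed quantities match the tabulated expression values.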
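The linear programming formulation calls for finding the \(y_{ik}\) that 

\[
\begin{aligned}
\text{Minimize}\quad & 50y_{00} + 53.4y_{01} + 65.2y_{02} + 86.2y_{03} + 18.4y_{10} + 40.2y_{11} \\
& + 61.2y_{12} + 5.2y_{20} + 36.2y_{21} + 1.2y_{30},
\end{aligned}
\]

subject to 

\[
\begin{aligned}
& y_{00} + y_{01} + y_{02} + y_{03} + y_{10} + y_{11} + y_{12} + y_{20} + y_{21} + y_{30} = 1, \\
& y_{00} + y_{01} + y_{02} + y_{03} - [y_{00}(1) + y_{01}(0.632) + y_{02}(0.264) + y_{03}(0.080) \\
& \qquad + y_{10}(0.632) + y_{11}(0.264) + y_{12}(0.080) + y_{20}(0.264) + y_{21}(0.080) + y_{30}(0.080)] = 0, \\
& y_{10} + y_{11} + y_{12} - [y_{01}(0.368) + y_{02}(0.368) + y_{03}(0.184) + y_{10}(0.368) \\
& \qquad + y_{11}(0.368) + y_{12}(0.184) + y_{20}(0.368) + y_{21}(0.184) + y_{30}(0.184)] = 0, \\
& y_{20} + y_{21} - [y_{02}(0.368) + y_{03}(0.368) + y_{11}(0.368) + y_{12}(0.368) \\
& \qquad + y_{20}(0.368) + y_{21}(0.368) + y_{30}(0.368)] = 0, \\
& y_{30} - [y_{03}(0.368) + y_{12}(0.368) + y_{21}(0.368) + y_{30}(0.368)] = 0,
\end{aligned}
\]

and \(y_{00}, y_{01}, y_{02}, y_{03}, y_{10}, y_{11}, y_{12}, y_{20}, y_{21}, y_{30} \geq 0\). 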


This linear program can be solved by using the simplex method. The results 
yield all \(y_{ik}\) equal to zero, except for 

\[
y_{03} = 0.148, \quad y_{12} = 0.252, \quad y_{20} = 0.368, \quad y_{30} = 0.233.
\]

The corresponding \(D_{ik}\) are given by 

\[
D_{03} = D_{12} = D_{20} = D_{30} = 1,
\]

and all the remaining \(D_{ik} = 0\). 
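One way to sanity-check this solution without running the simplex method is to note that, for a deterministic policy, the nonzero y values are the steady-state probabilities of the Markov chain that the policy induces, and the objective value is the expected average cost. The sketch below assumes the optimal policy found earlier (order 3, 2, 0, 0 cameras in states 0 to 3) and the rounded data of this section; `solve`, `P`, and `pi` are illustrative names.

```python
def solve(a, b):
    """Gauss-Jordan elimination with partial pivoting for a x = b."""
    n = len(b)
    m = [[float(x) for x in row] + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Transition matrix and costs under the policy: order 3, 2, 0, 0 in states 0..3.
P = [
    [0.080, 0.184, 0.368, 0.368],
    [0.080, 0.184, 0.368, 0.368],
    [0.264, 0.368, 0.368, 0.0],
    [0.080, 0.184, 0.368, 0.368],
]
C = [86.2, 61.2, 5.2, 1.2]

# Steady-state equations: balance for states 0, 1, 2 plus the normalization row.
A = [[P[i][j] - (1.0 if i == j else 0.0) for i in range(4)] for j in range(3)]
A.append([1.0, 1.0, 1.0, 1.0])
pi = solve(A, [0.0, 0.0, 0.0, 1.0])

average_cost = sum(p * c for p, c in zip(pi, C))
print([round(p, 3) for p in pi], round(average_cost, 2))
```

The resulting probabilities play the role of y_03, y_12, y_20, and y_30, and the average cost agrees with the gain obtained from the policy improvement algorithm.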


20.8 Conclusions 


The material presented in this chapter represents a powerful tool for formulating 
models and finding the optimal policies for controlling a large class of systems — those 
that are Markovian decision processes. These techniques are applicable to the solution 
of problems in such areas as queueing theory, inventory, maintenance, and 
probabilistic dynamic programming, in general. 

Two algorithms were presented, the policy improvement algorithm and the linear 
programming formulation, for finding optimal policies. It is evident from the examples 
that data-collection requirements are high. Even if the solution converges rapidly in 
the policy improvement algorithm, completing step 2 requires considerable calculation 
for systems with a large number of states. Using the linear programming formulation 
with, say, 50 states and 25 decisions leads to 1,250 variables and 51 constraints 
(excluding the nonnegativity constraints), which represents a large linear program. 
Nevertheless, these two solution methods are useful for solving real-world problems. 


When the cost criterion is the expected discounted cost, the method of successive 
approximations provides a valuable tool for approximating the optimal solution. Much 
simpler calculations are required for this algorithm than for the policy improvement 
or linear programming methods. 
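As an illustration of why the calculations are simpler, the method of successive approximations only repeats one minimization per state and never solves a system of equations. The sketch below applies it to the water-release example of this chapter; the inflow probabilities (1/6, 1/3, 1/3, 1/6), the iteration count, and all names are assumptions made for the illustration.

```python
ALPHA = 0.99
INFLOW = [1 / 6, 1 / 3, 1 / 3, 1 / 6]  # assumed P{inflow = 0, 1, 2, 3 units}
STATES = range(4)
DECISIONS = (1, 2, 3)


def cost(i, k):
    """Cost in hundred thousands: buy power if empty, else earn 1 per unit released."""
    return 3.0 if i == 0 else -float(min(i, k))


def row(i, k):
    """Next-state distribution after releasing min(i, k) units from state i."""
    left = i - min(i, k)
    probs = [0.0] * 4
    for inflow, p in enumerate(INFLOW):
        probs[min(left + inflow, 3)] += p
    return probs


def q(i, k, values):
    """One-step cost plus discounted expected future cost."""
    return cost(i, k) + ALPHA * sum(p * v for p, v in zip(row(i, k), values))


values = [0.0] * 4
for _ in range(3000):  # with alpha = 0.99 the error shrinks by about 1% per pass
    values = [min(q(i, k, values) for k in DECISIONS) for i in STATES]

policy = [min(DECISIONS, key=lambda k: q(i, k, values)) for i in STATES]
print(policy, [round(v, 3) for v in values])
```

A discount factor this close to 1 makes convergence slow, which is why a few thousand passes are used here; in practice the iteration is usually stopped once successive value vectors differ by less than a chosen tolerance.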

Considerable research activities have been devoted to the field of Markovian 
decision processes in recent years. C. Derman¹ has shown that in the expected average 
cost per unit time case, the optimal policy is deterministic (calls for always taking a 
particular action when the system is in a given state). Similarly, he has shown² that 
in the expected discounted cost case, the optimal policy is also deterministic. The 
policy improvement algorithm is due to R. Howard.³ For the expected average cost 
per unit time case, he presents not only the algorithm for the situation where all the 
states belong to one class but also an algorithm that is applicable when there is more 
than one (a finite number) class of states. He also considers the continuous time case. 
The linear programming formulation using the expected average cost per unit time 
was first given by A. S. Manne,⁴ who treated the case where all the states belong to 
one class. The linear programming formulation using the expected discounted cost 
was first given by F. d'Epenoux.⁵ Finally, although the results presented in this chapter 
assumed the state space to be finite, most of the results are applicable to the case of 
a countable state space. 
people, he enters the facility and becomes an actual customer. The manager of the facility has 
two types of service rates available. If she uses her "slow" service rate at a cost of $3 during 
a period, a customer will be served and leave the facility with probability $. If she uses her 
"fast" service rate at a cost of $9 during a period, a customer will be served and leave the 
facility with probability 4. Note that the probability of more than one customer arriving or more 
than one customer being served in a period is zero. A profit of $50 is earned when a customer 
is served. 

(a) Use exhaustive enumeration to identify all the stationary deterministic policies. 

(b) Use the policy improvement algorithm to determine the policy the manager should 

follow to minimize the expected long-run average cost per period. 

(Hint: In computing the costs for services when two customers are at a facility, do not 

forget the opportunity cost of losing a potential customer.) 


2.* Formulate Prob. 1 as a linear programming problem. 


3. A person often finds that she is up to 1 hour late for work. If she is from 1 to 30 
minutes late, $4 is deducted from her paycheck; if she is from 31 to 60 minutes late for work, 
$8 is deducted from her paycheck. If she drives to work at her normal speed (which is well 
under the speed limit), she can arrive in 20 minutes. However, if she exceeds the speed limit 
a little here and there on her way to work, she can get there in 10 minutes, but she runs the 
risk of getting a speeding ticket. With probability $ she will get caught speeding and will not 
only be fined $20 but will also be delayed 10 minutes, so that it takes 20 minutes to reach 
work. 

Let s be the time she finds she has to reach work before being late; that is, s = 10 means 
she has 10 minutes to get to work and s = -10 means she is already 10 minutes late for work. 
For simplicity, she considers s to be in one of four intervals: (20, ∞), (10, 19), (-10, 9), and 
(-20, -11). 

The transition probabilities for s tomorrow if she does not speed today are given by 


            (20, ∞)   (10, 19)   (-10, 9)   (-20, -11) 














The transition probabilities for s tomorrow if she speeds to work today are given by 


            (20, ∞)   (10, 19)   (-10, 9)   (-20, -11) 
(20, ∞) 
(10, 19) 
(-10, 9) 
(-20, -11) 

















Note that there are no transition probabilities for (20, ∞) and (-10, 9), because she will 
get to work on time and from 1 to 30 minutes late, respectively, regardless of whether she 
speeds or not. Hence speeding when in these states would not be a logical choice. 

Also note that the transition probabilities imply that the later she is for work and the 
more she has to rush to get there, the likelier she is to leave for work earlier the next day. 

Use the policy improvement algorithm to determine when she should speed and when 
she should take her time getting to work. 


4. Formulate Prob. 3 as a linear programming problem. 


5. Every Saturday night a man plays poker, much to the dismay of his wife. Regardless 
of the kind of mood his wife is in, if he takes her out to dinner (at an expected cost of $14) 
before going to play poker, she will be in a good mood, with probability $, and a bad mood, 
with probability å, next Saturday night. However, if he goes to play poker without taking her 
out to dinner first, she will be in a good mood next Saturday, with probability $, and a bad 
mood, with probability $, regardless of her mood this week. Furthermore, if she happens to be 
in a bad mood and he does not take her to dinner, she will go to an exclusive store and buy a 
new outfit (at an expected cost of $75). Use the policy improvement algorithm to find the policy 
that the man should follow to minimize his long-run expected average cost per week. 


6. Formulate Prob. 5 as a linear programming problem. 


7.* When a tennis player serves, he gets two chances to serve in bounds. If he fails to 
do so twice, he loses the point (1 unit). If he attempts to serve an ace, he serves in bounds, 
with probability 3. If he serves a lob, he serves in bounds, with probability §. If he serves an 
ace in bounds, he wins the point (1 unit), with probability 3. With an inbounds lob, he wins 
the point (1 unit), with probability 4. Use the policy improvement algorithm to determine the 
optimal strategy. (Hint: Let state 0 denote point over, two serves to go on next point; and let 
state 1 denote one serve left.) 


8.* Formulate Prob. 7 as a linear programming problem. 


9. A student is concerned about her car and does not like to get it dented. When she 
drives to school she has a choice of parking it on the street in one space, parking it on the 
street and taking up two spaces, or parking in the lot. If she parks on the street in one space, 
her car gets dented, with probability 75. If she parks on the street and takes two spaces, the 
probability of a dent is 3y and the probability of a $15 ticket is 7%. Parking in a lot costs $5, 
but the car will not get dented. If her car gets dented, she can have it repaired at the dealer, 
in which case it is out of commission for a day and costs her $50 in fees and cab fares. She 
can also drive her car dented, but she feels that the loss of pride and shame is worth about $9 
a day. Using the policy improvement algorithm, determine the optimal policy. 


10. Formulate Prob. 9 as a linear programming problem. 


11. Each year Mr. Merrill has the chance to invest in two different no-load mutual 
funds: the Go-Go Fund or the Go-Slow Mutual Fund. At the end of each year, Mr. Merrill 
liquidates his holdings, takes his profits, and then reinvests. The yearly profits of the mutual 
funds are dependent upon how the market reacts each year. Recently the market has been 
oscillating around the 2,500 mark, according to the probabilities given in the following matrix: 

        2,400   2,500   2,600 
2,400 |  0.3     0.5     0.2 
2,500 |  0.1     0.5     0.4 
2,600 |  0.2     0.4     0.4 


Each year the market moves up (or down) 100 points, the Go-Go Fund has profits (or losses) 
of $20, while the Go-Slow Fund has profits (or losses) of $10. If the market moves up (or 
down) 200 points in a year, the Go-Go Fund has profits (or losses) of $50, while the Go-Slow 
Fund has profits (or losses) of only $20. If the market does not change, there is no profit or 
loss for either fund. Use the policy improvement algorithm to determine how Mr. Merrill should 
invest each year. 


12. Formulate Prob. 11 as a linear programming problem. 


13. Consider the blood-inventory problem presented in Chap. 15, Prob. 15. Suppose 
now that the number of pints of blood delivered (on a regular delivery) can be specified at the 
time of delivery (instead of using the old policy of receiving one pint at each delivery). Thus, 
the number of pints delivered can be 0, 1, 2, or 3 (more than 3 pints can never be used). The 
cost of regular delivery is $50 per pint, while the cost of an emergency delivery is $100 per 
pint. Starting with the proposed policy given in Chap. 15, Prob. 15, perform two iterations of 
the policy improvement algorithm. 


14. Formulate Prob. 13 as a linear programming problem. 


15. A soap company specializes in a luxury type of bath soap. The sales of this soap 
fluctuate between two levels—‘‘low’’ and ‘‘high’’—depending upon two factors: (1) whether 
they advertise or not, and (2) the advertising and marketing of new products by competitors. 
The second factor is out of the company’s control, but they are trying to determine what their 
own advertising policy should be. For example, the marketing manager’s proposal is to ad- 
vertise when sales are low but not to advertise when sales are high (a particular policy). 
Advertising in any quarter of a year has its primary impact on sales in the following quarter. 
Therefore, at the beginning of each quarter, the needed information is available to forecast 
accurately whether sales will be low or high that quarter and to decide whether to advertise 
that quarter. 

The cost of advertising is $1,000,000 for each quarter of a year in which it is done. 
When advertising during a quarter, the probability of having high sales the next quarter is $ or 
2, depending upon whether the current quarter’s sales are high or low. These probabilities go 
down to 4 or $ when advertising is not done during the current quarter. The company’s quarterly 
profits (excluding advertising costs) are $4,000,000 when sales are high but only $2,000,000 
when sales are low. 

(a) Use exhaustive enumeration to identify all the stationary deterministic policies. Eval- 

uate the expected average cost per quarter for the marketing manager’s proposal. 

(b) Use the policy improvement algorithm to find the optimal policy. 


16. Formulate Prob. 15 as a linear programming problem. 


17. Consider an infinite-period inventory problem involving a single product where, at 
the beginning of each period, a decision must be made about how many items to produce during 
that period. The setup cost is $10, and the unit production cost is $5. The storage cost for each 
item not sold during the period is $4 (a maximum of 2 items can be stored). The demand during 
each period has a known probability distribution, namely, a probability of 1/3 of 0, 1, and 2 
items, respectively. If the demand exceeds the supply available during the period, then those 
sales are lost and a penalty cost (including lost revenue) is incurred, namely, $8 and $32 for a 
shortage of 1 and 2 items, respectively. 

(a) Consider the (s, S) policy, (s, S) = (1, 2), where enough items are produced to 
raise the current inventory level to 2 items if, and only if, the inventory level at the 
beginning of a period is less than 1 item; i.e., no items are present. Determine the 
(long-run) expected average cost per period for this policy. In finding the transition 
matrix for the Markov chain for this policy, let the states represent the inventory 
levels at the beginning of the period. 

(b) Use exhaustive enumeration to identify all the stationary deterministic inventory 
policies. 


18. Use the policy improvement algorithm to find the optimal policy in Prob. 17. 
19. Formulate Prob. 17 as a linear programming problem. 


20. Buck and Bill Bogus are twin brothers who work at a gas station and have a 
counterfeiting business on the side. Each day a decision is made as to who will go to work at 
the gas station, and who will stay home and run the printing press in the basement. Each day 
that the machine works properly, it is estimated that 60 usable $20 bills can be produced. 
However, the machine is somewhat unreliable and breaks down frequently. If the machine is 
not working at the beginning of the day, Buck can have it in working order by the beginning 


of the next day with probability 0.6. If Bill works on the machine, the probability decreases 
to 0.5. If Bill operates the machine when it is working, the probability is $ that it will still be 
working at the beginning of the next day. If Buck operates the machine, it breaks down with 
probability $. (Assume for simplicity that all breakdowns occur at the end of the day.) 
(a) Formulate the problem of minimizing costs, i.e., maximizing the value of the Bogus 
brothers’ labor, as a Markov decision process. (Specify states, decisions, transition 
matrices, and C;,’s.) 
(b) Starting with the policy of always having Buck stay home to work with/on the 
machine, use the policy improvement algorithm to determine the optimal policy. 


21. Formulate Prob. 20 as a linear programming problem. 


22.* Suppose a person wants to dispose of a car. He receives an offer each month and 
must decide immediately whether or not to accept the offer. Once rejected, the offer is lost. 
The possible offers are $600, $800, and $1,000, made with probabilities 3, 4, and $, respectively 
(it may be assumed that successive offers are independent of each other). Suppose that there 
is a maintenance cost of $60 per month and that a discount factor of α = 0.95 is specified. 
Using the policy improvement algorithm, find a policy that minimizes the expected long-run 
total discounted cost. (Hint: There are two actions: Accept or reject the offer. Let the state 
space at time t denote the offer at time t, augmented by the state ∞. The process goes to state 
∞ whenever an offer is accepted, and it remains there at a monthly cost of 0.) Find the optimal 
policy using the policy improvement algorithm. 


23.* Formulate Prob. 22 as a linear programming problem, assuming that the initial 
states are equally likely. 


24.* In Prob. 22, use three iterations of the method of successive approximations to 
approximate the optimal solution. 


25. The price of a certain stock is fluctuating among the prices $10, $20, and $30 from 
month to month. Market analysts have predicted that if the stock is at $10 during any month, 
it will be at $10 or $20 next month, with probabilities 4 and 4, respectively; if the stock is at 
$20, it will be at $10, $20, or $30 next month, with probabilities 4, 4, and $, respectively; and 
if the stock is at $30, it will be at $20 or $30 next month, with probabilities ł and 4, respectively. 
Given a discount factor of 0.9, use the policy improvement algorithm to determine when to sell 
and when to hold the stock to maximize the expected long-run total discounted profits. (Hint: 
Augment the state space with a state that is reached with probability 1 when stock is sold and 
with probability 0 when the stock is held.) 


26. Formulate Prob. 25 as a linear programming problem. 


27. In Prob. 25, use three iterations of the method of successive approximations to 
approximate the optimal solution. 


28. A person is in the market for a house. Until she finds one she lives in a hotel for 
$70 a day. When she buys one, she pays immediately and moves in the next day. She can look 
at a house (if she chooses to) at most once a day, and when she does look at a house, she pays 
a broker’s fee of $50. The houses can cost $140,000, $170,000, and $200,000, and these costs 
each occur with probability 1/3 on any given day when she looks at a house. There is a daily 
discount factor of 0.999. Use the policy improvement algorithm to find the optimal policy. 


29. Formulate Prob. 28 as a linear programming problem. 


30. In Prob. 28, use three iterations of the method of successive approximations to 
approximate the optimal solution. 


31.* A farmer raises corn. Each year that he has a successful crop he grosses $17,000 
on expenses of $6,000 for seed and labor. Sometimes his crop fails and he grosses only $8,000. 
Each year the farmer has a chance of using two types of fertilizer: Type A, at a cost of $2,000, 
guarantees that there is a 60 percent chance of having a successful crop the next year; type B, 
at a cost of $3,000, guarantees that there is an 80 percent chance of a good crop next year. 
Using the policy improvement algorithm with a discount factor of 0.5, determine when the 
farmer should use fertilizers A and B. 


32.* Formulate Prob. 31 as a linear programming problem. 


33.* In Prob. 31, use three iterations of the method of successive approximations to 
approximate the optimal solution. 


34. A chemical company produces two chemicals, denoted by 0 and 1. Each month a 
decision is made as to which chemical to produce that month. Because the demand for each 
chemical is predictable, it is known that if 1 is produced this month, there is a 70 percent 
chance that it will also be produced again next month. Similarly, if 0 is produced this month, 
there is only a 20 percent chance that it will be produced again next month. 

To combat the emissions of pollutants, the chemical company has two processes, process 
A, which is efficient in combating the pollution from the production of 1 but not from 0, and 
process B, which is efficient in combating the pollution from the production of 0 but not from 
1. The amount of pollution from the production of each chemical under both processes is 


0 1 


Unfortunately, there is a time delay in setting up the pollution-control processes, so that 
a decision as to which process to use must be made in the month prior to the production 
decision. 
(a) Use exhaustive enumeration to identify all the stationary deterministic policies. 
(b) Use the policy improvement algorithm to determine a pollution-control policy that 
will minimize the present value of all future pollution at a discount factor of α = 0.5. 


A 
B 





35. Formulate Prob. 34 as a linear programming problem. 


36. In Prob. 34, use two iterations of the method of successive approximations to 
approximate the optimal solution. 


37. A man is playing a slot machine at $1 per play. Each time that he wins the jackpot 
of $10, he finds that if he pulls the lever hard, he has a 10 percent chance of winning on the 
next round. But if he pulls the lever gently, he has a 20 percent chance of hitting the jackpot 
on the next round. If he loses and then pulls the lever hard, he has an 80 percent chance of 
losing on the next round also. But if he pulls the lever gently, he has a 95 percent chance of 
losing again. Use the policy improvement algorithm with a discount factor of 0.9 to determine 
whether the player should pull the lever hard or gently. 


38. Formulate Prob. 37 as a linear programming problem. 


39. In Prob. 37, use two iterations of the method of successive approximations to 
approximate the optimal solution. 


40. Solve Prob. 28 as a four-period model. 

41.* Solve Prob. 31 as a four-period model. 
42. Solve Prob. 34 as a three-period model. 
43. Solve Prob. 37 as a three-period model. 


44. Formulate the water-resource model presented in Sec. 20.6 as a linear programming 
problem. 


45. Use three iterations of the method of successive approximations to approximate the 
optimal solution to the water-resource model presented in Sec. 20.6. 


46. Solve the inventory model presented in Sec. 20.7 using the expected discounted 
cost as the cost criterion, with a discount factor of α = 0.90. 


47. Use three iterations of the method of successive approximations to approximate the 
optimal solution to Prob. 46. 


48.* Find the optimal solution to a four-period machine-maintenance model using the 
data from the example presented in this chapter. (Use a discount factor of α = 0.90.) 


49. Solve the inventory model presented in Sec. 20.7 as a four-period problem. 


50. Solve the water-resource model presented in Sec. 20.6 as a three-period problem. 


Reliability 


21.1 Introduction 


The many definitions of reliability that exist depend upon the viewpoint of the user. 
However, they all have a common core that contains the statement that reliability, 
$R(t)$, is the probability that a device performs adequately over the interval $[0, t]$. In 
general, it is assumed that unless repair or replacement occurs, adequate performance 
at time $t$ implies adequate performance during the interval $[0, t]$. The device under 
consideration may be an entire system, a subsystem, or a component.¹ Although this 
definition is simple, the systems to which it is applied are generally very complex. In 
principle, it is possible to break down the system into black boxes, with each black 
box being in one of two states: good or bad. Mathematical models of the system can 
then be abstracted from the physical processes and the theory of combinatorial 
probability used to predict the reliability of the system. The black boxes may be 
independent of, or be very dependent upon, each other. For any reasonable system, such 
a probability analysis generally becomes so cumbersome that it must be considered 
impractical. Hence we seek other methods that either simplify the calculations or 
provide bounds on the reliability of the entire complex system. 

¹ A subsystem can be viewed as containing one or more components. 

As an example, consider an automobile. There are a large number of functional 
parts, wiring, and joints. These may be broken into subsystems, with each subsystem 
having a reliability associated with it. Possible subsystems are the engine, transmis- 
sion, exhaust, body, carburetor, and brakes. A mathematical model of the automobile 
system can be abstracted and the theory of combinatorial probability used to predict 
the reliability of the automobile. 


21.2 Structural Function of a System 


Suppose an automobile can be divided into $n$ components (subsystems). The 
performance of each component can be denoted by a random variable, $X_i$, that takes on 
the value $x_i = 1$ if the component performs satisfactorily for the desired time and 
$x_i = 0$ if the component fails during this time. In general, then, $X_i$ is a binary random 
variable defined by 

\[
X_i =
\begin{cases}
1, & \text{if component } i \text{ performs satisfactorily during time } [0, t] \\
0, & \text{if component } i \text{ fails during time } [0, t].
\end{cases}
\]

The performance of the system is measured by the binary random variable¹ 
$\phi(X_1, X_2, \ldots, X_n)$, where 

\[
\phi(X_1, X_2, \ldots, X_n) =
\begin{cases}
1, & \text{if the system performs satisfactorily during time } [0, t] \\
0, & \text{if the system fails during time } [0, t].
\end{cases}
\]

The function $\phi$ is called the structure function of the system and is just a function of 
the $n$ component random variables. Thus the performance of the automobile is a 
function of its $n$ components and takes on the value 1 if the automobile functions 
properly for the desired time and 0 if it does not. Because the performance of each 
component in the automobile takes on the value 1 or 0, the function $\phi$ is defined 
over $2^n$ points, with each point resulting in a 1 if the automobile performs satisfactorily 
and a 0 if the automobile fails. 

There are several important structure functions to consider, depending upon how 
the components are assembled. Three structure functions will be discussed in detail. 


Series System 


The series system is the simplest and most common of all the configurations. For a 
series system, the system fails if any component of the system fails; i.e., it performs 
satisfactorily if and only if all the components perform satisfactorily. The structure 
function for a series system is given by 


\[
\phi(X_1, X_2, \ldots, X_n) = X_1 X_2 \cdots X_n = \min\{X_1, X_2, \ldots, X_n\}.
\]

This equation holds because each $X_i$ is either 1 or 0. Hence the structure function 
takes on the value 1 if each $X_i$ equals 1 or, alternatively, if the minimum of the $X_i$ 
equals 1. For example, suppose the automobile is divided into only two components: 
the engine ($X_1$) and the transmission ($X_2$). Then it is reasonable to assume that the 
automobile will perform satisfactorily for the desired time period if and only if the 
engine and the transmission both perform satisfactorily. Hence 

\[
\phi(X_1, X_2) = X_1 X_2,
\]

and $\phi(1, 1) = 1$, $\phi(1, 0) = \phi(0, 1) = \phi(0, 0) = 0$. 

¹ Note that $X_i$ and $\phi$ are functions of the time $t$, but $t$ will be suppressed for ease of notation. 


Parallel System 


A parallel system of $n$ components is defined to be a system that fails if and only if all 
components fail, or alternatively, a system that performs satisfactorily if at least one of 
the $n$ components performs satisfactorily (with all $n$ components operating simultaneously). 
This property of parallel systems is often called redundancy (i.e., there are alternative 
components, existing within the system, to help the system operate successfully in 
case of failure of one or more components). The structure function for a parallel 
system is given by 

\[
\phi(X_1, X_2, \ldots, X_n) = 1 - (1 - X_1)(1 - X_2) \cdots (1 - X_n) = \max\{X_1, X_2, \ldots, X_n\}.
\]

This equation again follows because each $X_i$ is either 1 or 0. The structure function 
takes on the value 1 if at least one of the $X_i$ equals 1 or, alternatively, if the largest 
$X_i$ equals 1. In the automobile example, the car is equipped with front disk ($X_1$) and 
rear drum ($X_2$) brakes. The automobile will perform successfully if either the front or 
rear brakes operate properly.¹ If one is concerned with the structure function of the 
brake subsystem, then 

\[
\phi(X_1, X_2) = 1 - (1 - X_1)(1 - X_2) = X_1 + X_2 - X_1 X_2,
\]

and $\phi(1, 1) = \phi(1, 0) = \phi(0, 1) = 1$, $\phi(0, 0) = 0$. 


k Out of n System 


Some systems are assembled such that the system operates if at least $k$ out of $n$ 
components function properly. Note that the series system is a $k$ out of $n$ system, with 
$k = n$, and the parallel system is a $k$ out of $n$ system, with $k = 1$. The structure function 
for a $k$ out of $n$ system is given by 

\[
\phi(X_1, X_2, \ldots, X_n) =
\begin{cases}
1, & \text{if } \sum_{i=1}^{n} X_i \ge k \\
0, & \text{if } \sum_{i=1}^{n} X_i < k.
\end{cases}
\]

In the automobile example, consider a large truck equipped with eight tires. The 
structure function for the tire system is an example of a four out of eight system. 
(Although the system's performance may be degraded if fewer than eight tires are 
operating, rearrangement of the tire configuration will result in adequate performance 
as long as at least four tires are usable.) 

¹ It is evident that the loss of the front or rear brakes will affect the braking capability of the automobile, 
but the definition of "perform successfully" may allow for either set working. 

It is reasonable to expect the performance of an automobile to improve if the 
performance of one or more components is improved. This improvement can be 
reflected in the characterization of the structure function, where, for example, one 
would expect $\phi(1, 0, 0, 1)$ to be no less than $\phi(1, 0, 0, 0)$. Hence it will be assumed 
that if $x_i \le y_i$, for $i = 1, 2, \ldots, n$, then 

\[
\phi(y_1, y_2, \ldots, y_n) \ge \phi(x_1, x_2, \ldots, x_n).
\]

A system possessing this property ($\phi$ is an increasing function of $x$) is called a 
coherent (or monotone) system. 
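
The three structure functions of this section, and the coherence property, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`phi_series`, `phi_parallel`, `phi_k_out_of_n`, `is_coherent`) are hypothetical and do not come from the text, and the coherence check simply enumerates all $2^n$ points, so it is practical only for small $n$.

```python
from itertools import product

def phi_series(x):
    """Series structure: 1 only if every component works (the minimum of the x_i)."""
    return min(x)

def phi_parallel(x):
    """Parallel structure: 1 if at least one component works (the maximum of the x_i)."""
    return max(x)

def phi_k_out_of_n(x, k):
    """k out of n structure: 1 if at least k of the components work."""
    return 1 if sum(x) >= k else 0

def is_coherent(phi, n):
    """Brute-force monotonicity check (hypothetical helper, not from the text).

    Raising any single x_i from 0 to 1 must never lower phi; chaining such
    single-component improvements covers every pair x <= y.
    """
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]
                if phi(y) < phi(x):
                    return False
    return True

# Engine/transmission (series) and front/rear brakes (parallel) examples.
print(phi_series((1, 0)), phi_parallel((1, 0)))
# Eight tires, four usable: the four out of eight system still operates.
print(phi_k_out_of_n((1, 1, 1, 1, 0, 0, 0, 0), 4))
```

Passing a function such as `lambda x: 1 - min(x)` to `is_coherent` returns `False`, which illustrates the kind of structure the monotonicity assumption rules out.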


21.3 System Reliability 


The structure function of a system containing $n$ components is a binary random 
variable that takes on the value 1 or 0. Furthermore, the reliability of this system can be 
expressed as¹ 

\[
R = P\{\phi(X_1, X_2, \ldots, X_n) = 1\}.
\]

Thus, for a series system, the reliability is given by 

\[
R = P\{X_1 X_2 \cdots X_n = 1\} = P\{X_1 = 1, X_2 = 1, \ldots, X_n = 1\}.
\]

When the usual terms for conditional probability are employed, 

\[
R = P\{X_1 = 1\}\, P\{X_2 = 1 \mid X_1 = 1\}\, P\{X_3 = 1 \mid X_1 = 1, X_2 = 1\}
    \cdots P\{X_n = 1 \mid X_1 = 1, \ldots, X_{n-1} = 1\}.
\]





In general, such conditional probabilities require careful analysis. For example, 
$P\{X_2 = 1 \mid X_1 = 1\}$ is the probability that component 2 will perform successfully, 
given that component 1 performs successfully. Consider a system where the heat from 
component 1 affects the temperature of component 2 and thereby its probability of 
success. The performance of these components is then dependent, and the evaluation 
of the conditional probability is extremely difficult. If, on the other hand, the 
performance characteristics of these components do not interact, e.g., the temperature of 
one component does not affect the performance of the other component, then the 
components can be said to be independent. The expression for the reliability then 
simplifies and becomes 

\[
R = P\{X_1 = 1\}\, P\{X_2 = 1\} \cdots P\{X_n = 1\}.
\]


When the components of a series system are assumed to be independent, it should be 
noted that the reliability is a function of the probability distribution of the $X_i$. This 
phenomenon is true for any system structure. 

Unless otherwise specified, it will be assumed throughout the remainder of this 
chapter that the component performances are independent. Hence the probability 
distribution of the binary random variables $X_i$ can be expressed as 

\[
P\{X_i = 1\} = p_i \quad \text{and} \quad P\{X_i = 0\} = 1 - p_i.
\]

Thus, for systems composed of independent components, the reliability becomes a 
function of the $p_i$; that is, 

\[
R = R(p_1, p_2, \ldots, p_n).
\]

¹ The time $t$ is now suppressed in the notation. Recall that the time is implicitly included in determining 
whether or not the $i$th component performs satisfactorily. 


Reliability of Series Systems 
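As previously indicated, for a series structure, 

\[
\begin{aligned}
R(p_1, p_2, \ldots, p_n) &= P\{\phi(X_1, X_2, \ldots, X_n) = 1\} \\
&= P\{X_1 X_2 \cdots X_n = 1\} \\
&= P\{X_1 = 1, X_2 = 1, \ldots, X_n = 1\} \\
&= P\{X_1 = 1\}\, P\{X_2 = 1\} \cdots P\{X_n = 1\} \\
&= p_1 p_2 \cdots p_n.
\end{aligned}
\]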


Thus, returning to the automobile example, if the probability that the engine performs 
satisfactorily is 0.95 and the probability that the transmission performs satisfactorily 
is 0.99, then the reliability of this automobile series subsystem is given by R = 
(0.95)(0.99) = 0.94. 


Reliability of Parallel Systems 
The structure function for a parallel system is 

\[
\phi(X_1, X_2, \ldots, X_n) = \max(X_1, X_2, \ldots, X_n),
\]

and the reliability is given by 

\[
\begin{aligned}
R(p_1, p_2, \ldots, p_n) &= P\{\max(X_1, X_2, \ldots, X_n) = 1\} \\
&= 1 - P\{\text{all } X_i = 0\} \\
&= 1 - P\{X_1 = 0, X_2 = 0, \ldots, X_n = 0\} \\
&= 1 - (1 - p_1)(1 - p_2) \cdots (1 - p_n).
\end{aligned}
\]

Thus, if the probability that the front disk brakes and the rear drum brakes perform 
satisfactorily is 0.99 for each, the subsystem reliability is given by 

\[
R = 1 - (0.01)(0.01) = 0.9999.
\]


Reliability of k Out of n Systems 


The structure function for a $k$ out of $n$ system is 

\[
\phi(X_1, X_2, \ldots, X_n) =
\begin{cases}
1, & \text{if } \sum_{i=1}^{n} X_i \ge k \\
0, & \text{if } \sum_{i=1}^{n} X_i < k,
\end{cases}
\]

and the reliability is given by 

\[
R(p_1, p_2, \ldots, p_n) = P\left\{ \sum_{i=1}^{n} X_i \ge k \right\}.
\]

The evaluation of this expression is, in general, quite difficult except for the 
case of $p_1 = p_2 = \cdots = p_n = p$. Under this assumption, $\sum_{i=1}^{n} X_i$ has a binomial 
distribution with parameters $n$ and $p$, so that 

\[
R(p, p, \ldots, p) = \sum_{i=k}^{n} \binom{n}{i} p^i (1 - p)^{n-i}.
\]

For the truck tire example, if each tire has a probability of 0.95 of performing 
satisfactorily, then the reliability of a four out of eight system is given by 

\[
R = \sum_{i=4}^{8} \binom{8}{i} (0.95)^i (0.05)^{8-i} = 0.9999.
\]

For general structures, the system reliability calculations can become quite 
tedious. A technique for computing reliabilities for this general case will be presented 
in the next section. However, the final result of this section is to indicate that the 
reliability function of a system of independent components can be shown to be an 
increasing function of the $p_i$; that is, if $p_i \le q_i$ for $i = 1, 2, \ldots, n$, then 

\[
R(q_1, q_2, \ldots, q_n) \ge R(p_1, p_2, \ldots, p_n).
\]

This result is analogous to, and dependent upon, the assumption that the structure 
function of the system is coherent. The implication of this intuitive result is that the 
reliability of the automobile will improve if the reliability of one or more components 
is improved. 
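
The three reliability formulas for independent components can be checked numerically with a short Python sketch. The function names are hypothetical (they are not from the text), and the $k$ out of $n$ function assumes identical components with a common probability $p$, as in the binomial formula above.

```python
from math import comb, prod

def series_reliability(p):
    """Product of the component reliabilities p_1 p_2 ... p_n."""
    return prod(p)

def parallel_reliability(p):
    """One minus the probability that every component fails."""
    return 1 - prod(1 - p_i for p_i in p)

def k_out_of_n_reliability(n, k, p):
    """Binomial tail: probability that at least k of n identical components work."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Engine (0.95) and transmission (0.99) in series.
print(round(series_reliability([0.95, 0.99]), 4))    # 0.9405
# Front and rear brakes (0.99 each) in parallel.
print(round(parallel_reliability([0.99, 0.99]), 4))  # 0.9999
# Four out of eight tires, each with reliability 0.95 (slightly above 0.9999).
print(k_out_of_n_reliability(8, 4, 0.95))
```

Setting `k = n` reproduces the series value and `k = 1` the parallel value for identical components, which matches the remark that both are special cases of the $k$ out of $n$ system.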


21.4 Calculation of Exact System Reliability 


A representation of the structure of a system can be expressed in terms of a network, 
and some of the material presented in Chap. 10 is relevant. For example, consider 
the system that can be represented by the network in Fig. 21.1. This system consists 
of five components, connected in a somewhat complex manner. According to the 
network diagram, the system will operate successfully if there exists a flow from A 
(source) to D (sink) through the directed graph, i.e., if components 1 and 4 operate 
successfully, or components 2 and 5 operate successfully, or components 1, 3, and 5 
operate successfully. In fact, each arc can be viewed as having capacity 1 or 0, 
depending upon whether or not the component is operating. 

Figure 21.1 A five-component system. 

If an arc has a 0 attached to it (the component fails), then the network would lose 
that arc, and the system would operate successfully if and only if there was a path from 
the source to the sink in the resultant network. This situation is illustrated in Fig. 21.2, 
where the system still operates if components 3 and 4 fail but becomes inoperable if 
components 2, 3, and 4 fail. This suggests a possible method for computing the exact 
system reliability. Again, denote the performance of the $i$th component by the binary 
random variable $X_i$. Then $X_i$ takes on the value 1 with probability $p_i$ and 0 with 
probability $(1 - p_i)$. For each realization, $X_1 = x_1$, $X_2 = x_2$, $X_3 = x_3$, $X_4 = x_4$, and 
$X_5 = x_5$ (there are $2^5$ such realizations), it is determined whether or not the system will 
operate, i.e., whether or not the structure function equals 1. The network consisting of 
those arcs with $X_i$ equal to 1 contains at least one path if and only if the corresponding 
structure function equals 1. If a path is formed, the probability of obtaining this 
configuration is computed. For the realization in Fig. 21.2a, a path is formed, and 

\[
P\{X_1 = 1, X_2 = 1, X_3 = 0, X_4 = 0, X_5 = 1\} = p_1 p_2 (1 - p_3)(1 - p_4) p_5.
\]

Because each realization is disjoint, the system reliability is just the sum of the 
probabilities of those realizations that contain a path. Unfortunately, even for this 
simple system, 32 different realizations must be evaluated, and other techniques are 
desirable. 
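
The enumeration just described can be sketched in Python for the five-component example. The sketch assumes the three source-to-sink routes named in the text (components 1 and 4; 1, 3, and 5; 2 and 5); the names `MIN_PATHS`, `system_works`, and `exact_reliability` are hypothetical.

```python
from itertools import product

# Routes from source to sink in the five-component example, as listed in the text.
MIN_PATHS = [(1, 4), (1, 3, 5), (2, 5)]

def system_works(x):
    """x maps component number -> 0/1; the system works if some route is fully up."""
    return any(all(x[c] for c in path) for path in MIN_PATHS)

def exact_reliability(p):
    """Sum the probabilities of all realizations that contain a path.

    p maps component number -> probability that the component works;
    components are assumed to be independent.
    """
    comps = sorted(p)
    total = 0.0
    for states in product((0, 1), repeat=len(comps)):  # 2^5 = 32 realizations here
        x = dict(zip(comps, states))
        if system_works(x):
            prob = 1.0
            for c in comps:
                prob *= p[c] if x[c] else 1 - p[c]
            total += prob
    return total

# The two realizations discussed for Fig. 21.2.
print(system_works({1: 1, 2: 1, 3: 0, 4: 0, 5: 1}))  # True
print(system_works({1: 1, 2: 0, 3: 0, 4: 0, 5: 1}))  # False
print(exact_reliability({c: 0.9 for c in range(1, 6)}))
```

The loop visits every realization, so the work doubles with each added component, which is the practical objection raised in the text.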

Another possible procedure for finding the exact reliability is to note that the 
reliability $R(p_1, p_2, \ldots, p_n)$ can be expressed as 

\[
R(p_1, p_2, \ldots, p_n) = P\{\text{maximum flow from source to sink} = 1\}.
\]

This identity allows the concept of paths and cuts presented in Chap. 10 to be used. 
In reliability theory, the terminology of minimal paths and minimal cuts is introduced. 
A minimal path is a minimal set of components that, by functioning, ensures the 
successful operation of the system. For the example in Fig. 21.1, components 2 and 
5 are a minimal path. A minimal cut is a minimal set of components that, by failing, 
ensures the failure of the system. In Fig. 21.1, components 1 and 2 are a minimal 
cut. For the system given in Fig. 21.1, the minimal paths and cuts are 

Minimal Paths   | Minimal Cuts 
$X_1 X_4$       | $X_1 X_2$ 
$X_1 X_3 X_5$   | $X_4 X_5$ 
$X_2 X_5$       | $X_2 X_3 X_4$ 
                | $X_1 X_5$ 








Figure 21.2 (a) System with components 3 and 4 failed; (b) system with components 2, 3, and 4 
failed. 


If we use all the minimal paths, there are two ways to obtain the exact system 
reliability. Because the system will operate if all the components in at least one of 
the minimal paths operate, the system reliability can be expressed as 

\[
\begin{aligned}
R(p_1, p_2, p_3, p_4, p_5) &= P\{\phi(X_1, X_2, X_3, X_4, X_5) = 1\} \\
&= P\{(X_1 X_4 = 1) \cup (X_1 X_3 X_5 = 1) \cup (X_2 X_5 = 1)\}.
\end{aligned}
\]

Using the algebra of sets, 

\[
\begin{aligned}
R(p_1, p_2, p_3, p_4, p_5) &= P\{X_1 X_4 = 1\} + P\{X_1 X_3 X_5 = 1\} + P\{X_2 X_5 = 1\} \\
&\quad - P\{X_1 X_3 X_4 X_5 = 1\} - P\{X_1 X_2 X_4 X_5 = 1\} - P\{X_1 X_2 X_3 X_5 = 1\} \\
&\quad + P\{X_1 X_2 X_3 X_4 X_5 = 1\} \\
&= p_1 p_4 + p_1 p_3 p_5 + p_2 p_5 - p_1 p_3 p_4 p_5 \\
&\quad - p_1 p_2 p_4 p_5 - p_1 p_2 p_3 p_5 + p_1 p_2 p_3 p_4 p_5 \\
&= 2p^2 + p^3 - 3p^4 + p^5, \quad \text{when } p_i = p.
\end{aligned}
\]

Notice that there are $2^3 - 1 = 7$ terms in the expansion of the reliability function 
(in general, if there are $r$ paths, then there are $2^r - 1$ terms in the expansion), so 
that this calculation is not simple. 
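
The expansion over minimal paths is an inclusion-exclusion sum, which can be sketched in Python as follows. The sketch assumes independent components; the name `reliability_from_paths` is hypothetical. Each nonempty subset of paths contributes the probability that every component appearing in the subset works, with a sign that alternates with the subset size.

```python
from itertools import combinations
from math import prod

MIN_PATHS = [(1, 4), (1, 3, 5), (2, 5)]

def reliability_from_paths(paths, p):
    """Inclusion-exclusion over all nonempty subsets of minimal paths (2^r - 1 terms).

    p maps component number -> probability that the component works.
    """
    total = 0.0
    for m in range(1, len(paths) + 1):
        sign = 1 if m % 2 == 1 else -1
        for subset in combinations(paths, m):
            # All paths in the subset work iff every component in their union works.
            comps = set().union(*subset)
            total += sign * prod(p[c] for c in comps)
    return total

p = {c: 0.9 for c in range(1, 6)}
print(reliability_from_paths(MIN_PATHS, p))
print(2 * 0.9**2 + 0.9**3 - 3 * 0.9**4 + 0.9**5)  # polynomial form for p_i = p
```

The two printed values should agree up to floating-point rounding.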
The second method of determining the system reliability from paths is as follows: 
For the minimal path containing components 1 and 4, $X_1 X_4 = 1$ if and only if both 
components function. This fact is similarly true for the other two minimal paths. 
However, the system will operate if all the components in at least one of the minimal 
paths operate. Hence paths operate as a parallel system, so that 

\[
\begin{aligned}
\phi(X_1, X_2, X_3, X_4, X_5) &= \max[X_1 X_4,\; X_1 X_3 X_5,\; X_2 X_5] \\
&= 1 - (1 - X_1 X_4)(1 - X_1 X_3 X_5)(1 - X_2 X_5).
\end{aligned}
\]

Because $X_i^2 = X_i$, then 

\[
\begin{aligned}
\phi(X_1, X_2, X_3, X_4, X_5) &= X_1 X_4 + X_1 X_3 X_5 + X_2 X_5 - X_1 X_3 X_4 X_5 \\
&\quad - X_1 X_2 X_4 X_5 - X_1 X_2 X_3 X_5 + X_1 X_2 X_3 X_4 X_5.
\end{aligned}
\]

Noting that $\phi$ is a binary random variable taking on the values 1 and 0, 

\[
E[\phi(X_1, X_2, X_3, X_4, X_5)] = P\{\phi(X_1, X_2, X_3, X_4, X_5) = 1\} = R(p_1, p_2, p_3, p_4, p_5).
\]

Therefore, 

\[
\begin{aligned}
R(p_1, p_2, p_3, p_4, p_5) &= E[X_1 X_4 + X_1 X_3 X_5 + X_2 X_5 - X_1 X_3 X_4 X_5 - X_1 X_2 X_4 X_5 \\
&\qquad - X_1 X_2 X_3 X_5 + X_1 X_2 X_3 X_4 X_5] \\
&= p_1 p_4 + p_1 p_3 p_5 + p_2 p_5 - p_1 p_3 p_4 p_5 - p_1 p_2 p_4 p_5 - p_1 p_2 p_3 p_5 \\
&\quad + p_1 p_2 p_3 p_4 p_5.
\end{aligned}
\]

This result is the same as the one obtained earlier and requires essentially the same 
amount of calculation. 

If we use all the minimal cuts, there are also two ways to obtain the exact system 
reliability. Because the system will fail if and only if all the components in at least 
one of the minimal cuts fail, the system reliability can be expressed as 

\[
\begin{aligned}
R(p_1, p_2, p_3, p_4, p_5) &= 1 - P\{\phi(X_1, X_2, X_3, X_4, X_5) = 0\} \\
&= 1 - P\{(X_1 = 0, X_2 = 0) \cup (X_4 = 0, X_5 = 0) \\
&\qquad\quad \cup\, (X_2 = 0, X_3 = 0, X_4 = 0) \cup (X_1 = 0, X_5 = 0)\} \\
&= 1 - P\{X_1 = 0, X_2 = 0\} - P\{X_4 = 0, X_5 = 0\} \\
&\quad - P\{X_2 = 0, X_3 = 0, X_4 = 0\} - P\{X_1 = 0, X_5 = 0\} \\
&\quad + P\{X_1 = 0, X_2 = 0, X_4 = 0, X_5 = 0\} \\
&\quad + P\{X_1 = 0, X_2 = 0, X_3 = 0, X_4 = 0\} \\
&\quad + P\{X_1 = 0, X_2 = 0, X_5 = 0\} \\
&\quad + P\{X_2 = 0, X_3 = 0, X_4 = 0, X_5 = 0\} \\
&\quad + P\{X_1 = 0, X_4 = 0, X_5 = 0\} \\
&\quad + P\{X_1 = 0, X_2 = 0, X_3 = 0, X_4 = 0, X_5 = 0\} \\
&\quad - P\{X_1 = 0, X_2 = 0, X_3 = 0, X_4 = 0, X_5 = 0\} \\
&\quad - P\{X_1 = 0, X_2 = 0, X_4 = 0, X_5 = 0\} \\
&\quad - P\{X_1 = 0, X_2 = 0, X_3 = 0, X_4 = 0, X_5 = 0\} \\
&\quad - P\{X_1 = 0, X_2 = 0, X_3 = 0, X_4 = 0, X_5 = 0\} \\
&\quad + P\{X_1 = 0, X_2 = 0, X_3 = 0, X_4 = 0, X_5 = 0\} \\
&= 1 - q_1 q_2 - q_4 q_5 - q_2 q_3 q_4 - q_1 q_5 + q_1 q_2 q_3 q_4 \\
&\quad + q_1 q_2 q_5 + q_2 q_3 q_4 q_5 + q_1 q_4 q_5 - q_1 q_2 q_3 q_4 q_5,
\end{aligned}
\]

where $q_i = 1 - p_i$. 

This result is, of course, algebraically equivalent to the one obtained previously, and 
it involves $2^4 - 1 = 15$ terms in the expansion of the reliability function. In general, 
if there are $s$ cuts, there are $2^s - 1$ terms in the expansion. 

The second method of determining the system reliability from cuts is as follows: For the 
minimal cut containing components 1 and 2, $1 - (1 - X_1)(1 - X_2) = 0$ if and only 
if both components fail. This fact is similarly true for the other three cuts. However, 
the system will operate if at least one of the components in each cut operates. Hence 
cuts operate as a series system, so that 

\[
\begin{aligned}
\phi(X_1, X_2, X_3, X_4, X_5)
&= \min\{1 - (1 - X_1)(1 - X_2),\; 1 - (1 - X_4)(1 - X_5), \\
&\qquad\quad 1 - (1 - X_2)(1 - X_3)(1 - X_4),\; 1 - (1 - X_1)(1 - X_5)\} \\
&= [1 - (1 - X_1)(1 - X_2)]\,[1 - (1 - X_4)(1 - X_5)] \\
&\quad \times [1 - (1 - X_2)(1 - X_3)(1 - X_4)]\,[1 - (1 - X_1)(1 - X_5)] \\
&= 1 - (1 - X_1)(1 - X_2) - (1 - X_4)(1 - X_5) \\
&\quad - (1 - X_2)(1 - X_3)(1 - X_4) - (1 - X_1)(1 - X_5) \\
&\quad + (1 - X_1)(1 - X_2)(1 - X_3)(1 - X_4) \\
&\quad + (1 - X_1)(1 - X_2)(1 - X_5) \\
&\quad + (1 - X_2)(1 - X_3)(1 - X_4)(1 - X_5) \\
&\quad + (1 - X_1)(1 - X_4)(1 - X_5) \\
&\quad - (1 - X_1)(1 - X_2)(1 - X_3)(1 - X_4)(1 - X_5).
\end{aligned}
\]

Taking expectations on both sides leads to the desired expression for the 
reliability. Again, this method requires essentially the same amount of calculation as 
required for the first procedure using cuts. 
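
As a sanity check on the two representations, the path form (a parallel arrangement of series paths) and the cut form (a series arrangement of parallel cuts) can be compared on all 32 realizations. The Python sketch below uses the minimal paths and cuts of the example; the function names are hypothetical.

```python
from itertools import product

MIN_PATHS = [(1, 4), (1, 3, 5), (2, 5)]
MIN_CUTS = [(1, 2), (4, 5), (2, 3, 4), (1, 5)]

def phi_from_paths(x):
    """Parallel arrangement of series paths: the max over paths of the min within a path."""
    return max(min(x[c] for c in path) for path in MIN_PATHS)

def phi_from_cuts(x):
    """Series arrangement of parallel cuts: the min over cuts of the max within a cut."""
    return min(max(x[c] for c in cut) for cut in MIN_CUTS)

def representations_agree():
    """Compare the two structure functions on every realization of five components."""
    for states in product((0, 1), repeat=5):
        x = dict(zip(range(1, 6), states))
        if phi_from_paths(x) != phi_from_cuts(x):
            return False
    return True

print(representations_agree())  # True
```

Because both forms define the same structure function, taking expectations of either one gives the same reliability.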

Although the results presented in this section were based upon the example, an 
extension to any system can be easily obtained. All minimal paths and/or cuts must 
be found and one of the four methods presented chosen. 

As previously mentioned, if there are $r$ paths and $s$ cuts in the network, then 
calculating the exact reliability using paths will involve summing $2^r - 1$ terms, and 
using cuts will involve $2^s - 1$ terms. Hence the method using paths should be used 
if and only if $r \le s$. Generally, however, it is simpler to find minimal paths rather 
than minimal cuts, so that the method using paths may have to be used because finding 
all cuts may be computationally infeasible. It is evident that finding the exact reliability 
of a system is quite difficult and that bounds are desirable, provided that the 
calculations are substantially reduced. 


21.5 Bounds on System Reliability 


It is evident that the calculations required to compute exact system reliability are 
numerous, and that other methods, such as obtaining upper and lower bounds, are 
desirable. 

There exists a well-known result concerning binary random variables, i.e.: 

If $X_1, X_2, \ldots, X_n$ are independent binary random variables that take on the 
value 1 or 0, and $Y_i = \prod_{j \in J_i} X_j$, where the product ranges over all $j$ that 
are elements in the set $J_i$, $i = 1, 2, \ldots, r$, then 

\[
P\{Y_1 = 0, Y_2 = 0, \ldots, Y_r = 0\} \ge P\{Y_1 = 0\}\, P\{Y_2 = 0\} \cdots P\{Y_r = 0\}.
\]


Returning to the example of Sec. 21.4, it was pointed out that the system will operate 
if all the components in at least one of the minimal paths operate, so that 

\[
\begin{aligned}
R(p_1, p_2, p_3, p_4, p_5) &= P\{\phi(X_1, X_2, X_3, X_4, X_5) = 1\} \\
&= 1 - P\{\text{all paths fail}\} \\
&= 1 - P\{X_1 X_4 = 0,\; X_1 X_3 X_5 = 0,\; X_2 X_5 = 0\}.
\end{aligned}
\]

From the aforementioned result on binary random variables, 

\[
\begin{aligned}
R(p_1, p_2, p_3, p_4, p_5) &\le 1 - P\{X_1 X_4 = 0\}\, P\{X_1 X_3 X_5 = 0\}\, P\{X_2 X_5 = 0\} \\
&= 1 - (1 - p_1 p_4)(1 - p_1 p_3 p_5)(1 - p_2 p_5) \\
&= 1 - (1 - p^2)^2 (1 - p^3), \quad \text{when } p_i = p,
\end{aligned}
\]

so that an upper bound is obtained. Similarly, in Sec. 21.4, it was pointed out that 
the system will operate if at least one of the components in each cut operates, so that 

\[
\begin{aligned}
R(p_1, p_2, p_3, p_4, p_5) &= P\{\phi(X_1, X_2, X_3, X_4, X_5) = 1\} \\
&= P\{\text{at least one of } X_1, X_2 \text{ operates; at least one of } X_4, X_5 \text{ operates;} \\
&\qquad \text{at least one of } X_2, X_3, X_4 \text{ operates; at least one of } X_1, X_5 \text{ operates}\} \\
&= P\{1 - (1 - X_1)(1 - X_2) = 1,\; 1 - (1 - X_4)(1 - X_5) = 1, \\
&\qquad 1 - (1 - X_2)(1 - X_3)(1 - X_4) = 1,\; 1 - (1 - X_1)(1 - X_5) = 1\} \\
&= P\{(1 - X_1)(1 - X_2) = 0,\; (1 - X_4)(1 - X_5) = 0, \\
&\qquad (1 - X_2)(1 - X_3)(1 - X_4) = 0,\; (1 - X_1)(1 - X_5) = 0\}.
\end{aligned}
\]

Now $(1 - X_i)$ are independent binary random variables that take on the values 1 and 
0, so that the result on binary random variables is again applicable; that is, 

\[
\begin{aligned}
R(p_1, p_2, p_3, p_4, p_5) &\ge P\{(1 - X_1)(1 - X_2) = 0\}\, P\{(1 - X_4)(1 - X_5) = 0\} \\
&\quad \times P\{(1 - X_2)(1 - X_3)(1 - X_4) = 0\}\, P\{(1 - X_1)(1 - X_5) = 0\} \\
&= [1 - (1 - p_1)(1 - p_2)]\,[1 - (1 - p_4)(1 - p_5)] \\
&\quad \times [1 - (1 - p_2)(1 - p_3)(1 - p_4)]\,[1 - (1 - p_1)(1 - p_5)] \\
&= [1 - (1 - p)^2]^3\,[1 - (1 - p)^3], \quad \text{when } p_i = p,
\end{aligned}
\]

so that a lower bound is obtained. 

Thus we obtain an upper bound on the reliability based upon paths and a lower 
bound based upon cuts. For example, if $p_i = p = 0.9$, then 

\[
0.9693 = [1 - (0.1)^2]^3 [1 - (0.1)^3] \le R(0.9, 0.9, 0.9, 0.9, 0.9)
\le 1 - [1 - (0.9)^2]^2 [1 - (0.9)^3] = 0.9902.
\]

Furthermore, the exact reliability obtained from the expressions in Sec. 21.4 is given 
by 

\[
R(0.9, 0.9, 0.9, 0.9, 0.9) = 2(0.9)^2 + (0.9)^3 - 3(0.9)^4 + (0.9)^5 = 0.9712.
\]

In general, this technique provides useful results in that the bounds are frequently 
quite narrow. 
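
The two bounds need only one product per path or cut, which the following Python sketch illustrates for the example with $p_i = 0.9$. It assumes independent components; the function names are hypothetical.

```python
from math import prod

MIN_PATHS = [(1, 4), (1, 3, 5), (2, 5)]
MIN_CUTS = [(1, 2), (4, 5), (2, 3, 4), (1, 5)]

def path_upper_bound(paths, p):
    """1 minus the product over paths of P{path fails}, treating paths as if independent."""
    return 1 - prod(1 - prod(p[c] for c in path) for path in paths)

def cut_lower_bound(cuts, p):
    """Product over cuts of P{at least one component in the cut works}."""
    return prod(1 - prod(1 - p[c] for c in cut) for cut in cuts)

p = {c: 0.9 for c in range(1, 6)}
lower = cut_lower_bound(MIN_CUTS, p)
upper = path_upper_bound(MIN_PATHS, p)
exact = 2 * 0.9**2 + 0.9**3 - 3 * 0.9**4 + 0.9**5
print(round(lower, 4), round(exact, 4), round(upper, 4))  # 0.9693 0.9712 0.9902
```

With $r$ paths and $s$ cuts the bounds take $r + s$ products, compared with $2^r - 1$ or $2^s - 1$ terms for the exact expansions.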


21.6 Bounds on Reliability Based upon Failure Times 


The previous sections considered systems that performed successfully during a 
designated period or failed during this same period. An alternative way of viewing 
systems is to view their performance as a function of time. 

Consider a component (or system) and its associated random variable, the time 
to failure, $T$. Denote the probability distribution of the time to failure of the component 
by $F$ and its density function by $f$. In terms of the previous discussion, the random 
variables $X$ and $T$ are related in that $X$ takes on the values 

\[
X =
\begin{cases}
1, & \text{if } T \ge t \\
0, & \text{if } T < t.
\end{cases}
\]

Then 

\[
R(t) = P\{X = 1\} = 1 - F(t) = \int_t^{\infty} f(y)\, dy.
\]

An appealing intuitive property in reliability is the failure rate. The failure rate 
$r(t)$ is defined for those values of $t$ for which $F(t) < 1$ by 

\[
r(t) = \frac{f(t)}{R(t)}.
\]

This function has a useful probabilistic interpretation; namely, $r(t)\, dt$ represents the 
conditional probability that an object surviving to age $t$ will fail in the interval 
$[t, t + dt]$. This function is sometimes called the hazard rate. 


In many applications, there is every reason to believe that the failure rate tends 
to increase because of the inevitable deterioration that occurs. A distribution whose 
failure rate remains constant or increases with age is said to have an increasing failure 
rate (IFR). 

In some applications, the failure rate tends to decrease. It would be expected to 
decrease initially, for instance, for materials that exhibit the phenomenon of work 
hardening. Certain solid-state electronic devices are also believed to have a decreasing 
failure rate. A distribution whose failure rate remains constant or decreases with age 
is said to have a decreasing failure rate (DFR). 

The failure rate possesses some interesting properties. The time to failure 
distribution is completely determined by the failure rate. In particular, it is easily shown 
that 

\[
R(t) = 1 - F(t) = \exp\left[ -\int_0^t r(\xi)\, d\xi \right].
\]

Thus an assumption made about the failure rate has direct implications on the time to 
failure distribution. As an example, consider a component whose failure distribution 
is given by the exponential; i.e., 

\[
F(t) = P\{T \le t\} = 1 - e^{-t/\theta}.
\]

Thus $R(t)$ is given by $e^{-t/\theta}$, and the failure rate is given by 

\[
r(t) = \frac{(1/\theta)\, e^{-t/\theta}}{e^{-t/\theta}} = \frac{1}{\theta}.
\]

Note that the exponential has a constant failure rate and hence has both IFR and DFR. 
In fact, using the expression relating the time to failure distribution and the failure 
rate, it is evident that a component having a constant failure rate must have a time to 
failure distribution that is exponential. 
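
The relation between $R(t)$ and $r(t)$ can be sketched as the following short derivation. It assumes that $F$ is differentiable, so that $f(t) = F'(t) = -R'(t)$, and that $F(0) = 0$, so that $R(0) = 1$; both assumptions are made here for illustration.

```latex
\begin{aligned}
r(t) &= \frac{f(t)}{R(t)} = -\frac{R'(t)}{R(t)} = -\frac{d}{dt} \log R(t), \\
\int_0^t r(\xi)\, d\xi &= -\log R(t) + \log R(0) = -\log R(t), \\
R(t) &= \exp\left[ -\int_0^t r(\xi)\, d\xi \right].
\end{aligned}
% Constant failure rate r(\xi) = 1/\theta:
% \int_0^t (1/\theta)\, d\xi = t/\theta, so R(t) = e^{-t/\theta},
% which is the exponential distribution with mean \theta.
```

The last comment shows why a constant failure rate forces the exponential form: the integral is linear in $t$.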


Bounds for IFR Distributions 


Under either the IFR or DFR assumption, it is possible to obtain sharp bounds on the 
reliability in terms of moments and percentiles. In particular, such bounds can be 
derived from statements based upon the mean time to failure. This fact is particularly 
important because many design engineers present specifications in terms of mean time 
to failure. 

Because the exponential distribution with constant failure rate is the boundary 
distribution between IFR and DFR distributions, it provides natural bounds on the 
survival probability of IFR and DFR distributions. In particular, it can be shown that 
if all that is known about the failure distribution is that it is IFR and has mean $\mu$, 
then the greatest lower bound on the reliability that can be given is 

\[
R(t) \ge
\begin{cases}
e^{-t/\mu}, & \text{for } t < \mu \\
0, & \text{for } t \ge \mu,
\end{cases}
\]

and the inequality is sharp; i.e., the exponential distribution with mean $\mu$ attains the 
lower bound for $t < \mu$, and the degenerate distribution concentrating at $\mu$ attains the 
lower bound for $t \ge \mu$. This situation can be represented graphically as shown in Fig. 
21.3. 

Figure 21.3 A lower bound on reliability for IFR distributions. 


The least upper bound on $R(t)$ that can be obtained if we know only that $F$ is 
IFR with mean $\mu$ is given by 

\[
R(t) \le
\begin{cases}
1, & \text{for } t \le \mu \\
e^{-\omega t}, & \text{for } t > \mu,
\end{cases}
\]

where $\omega$ depends on $t$ and satisfies $1 - \omega\mu = e^{-\omega t}$. It is important to note that the 
$\omega$ in the term $e^{-\omega t}$ is a function of $t$, so that a different $\omega$ must be found for each $t$. 
For fixed $t$ and $\mu$, this $\omega$ is obtained by finding the intersection of the linear function 
$(1 - \omega\mu)$ and the exponential function $e^{-\omega t}$. It can be shown that for $t > \mu$, such 
an intersection always exists. 

Thus $R(t)$ for an IFR distribution with mean $\mu$ can be bounded above and below, 
as shown in Fig. 21.4. Note that the lower bound is the only one of consequence for 
$t < \mu$, and that the upper bound is the only one of consequence for $t > \mu$. 
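
Finding the intersection for a given $t$ requires a numerical root search. The Python sketch below uses bisection, which is a choice made here for illustration and not a method prescribed by the text. It looks for the positive root of $g(\omega) = 1 - \omega\mu - e^{-\omega t}$ (the root $\omega = 0$ is trivial), bracketing it between the maximizer of $g$, $\ln(t/\mu)/t$, and $1/\mu$, where $g$ is negative. The function names are hypothetical.

```python
from math import exp, log

def solve_omega(t, mu, iterations=200):
    """Positive root of 1 - w*mu = exp(-w*t), assuming t > mu > 0 (bisection)."""
    if not t > mu > 0:
        raise ValueError("requires t > mu > 0")
    g = lambda w: 1 - w * mu - exp(-w * t)
    lo = log(t / mu) / t  # g is concave with g(0) = 0, so g(lo) > 0 at its maximizer
    hi = 1.0 / mu         # g(hi) = -exp(-t / mu) < 0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def ifr_upper_bound(t, mu):
    """Least upper bound on R(t) for an IFR distribution with mean mu."""
    return 1.0 if t <= mu else exp(-solve_omega(t, mu) * t)

def ifr_lower_bound(t, mu):
    """Greatest lower bound on R(t) for an IFR distribution with mean mu."""
    return exp(-t / mu) if t < mu else 0.0

# Bounds at t = 2 for a mean time to failure of 1 (illustrative numbers only).
print(ifr_lower_bound(2.0, 1.0), ifr_upper_bound(2.0, 1.0))
```

For an exponential distribution with mean 1, $R(2) = e^{-2}$, which lies between the two printed bounds, as it must since the exponential is IFR.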


Increasing Failure Rate Average 


Now that bounds on the reliability of a component have been obtained, what can be 
said about the preservation of monotone failure rate; i.e., what structures have the 
IFR property when their individual components have this property? Series structures 
of independent IFR (DFR) components are also IFR (DFR). k out of n structures 
consisting of n identical independent components, each having an IFR failure distri- 
bution, are also IFR; however, parallel structures of independent IFR components are 
not IFR unless they are composed of identical components. Thus it is evident that, 
even for some simple systems, there may not be a preservation of the monotone failure 
rate. 
Instead of using the failure rate as a means for characterizing the reliability, 

\[
R(t) = \exp\left[-\int_0^t r(\xi)\, d\xi\right],
\]

a somewhat less appealing characterization can be obtained from the failure-rate average 
function, 

\[
\frac{1}{t}\int_0^t r(\xi)\, d\xi = -\frac{\log R(t)}{t}.
\]


Figure 21.4 Upper and lower bounds on reliability for IFR distributions. 


A distribution F such that F(0) = 0 is called increasing failure rate average (IFRA) 
if and only if 

\[
\frac{1}{t}\int_0^t r(\xi)\, d\xi
\]

is nondecreasing in t ≥ 0. A similar definition is given for DFRA. It can be shown 
that a coherent system of independent components, each of which has an IFRA failure 
distribution, has a system failure distribution that is also IFRA. 

As with IFR systems, there are bounds for IFRA systems. It can be easily shown 
that IFR distributions are also IFRA distributions (but not the reverse), and the same 
upper bound as given for IFR distributions is applicable here. A sharp lower bound 
for IFRA distributions with mean μ is given by 

\[
R(t) \ge
\begin{cases}
0, & \text{for } t \ge \mu \\
e^{-bt}, & \text{for } t < \mu,
\end{cases}
\]

where b depends upon t and is defined by e^{-bt} = b(μ - t). 
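As with ω in the IFR upper bound, b must be found numerically for each t. The sketch below is an illustration rather than part of the original text; it assumes Python, and the name `ifra_lower_bound` is hypothetical. For t < μ the function h(b) = e^{-bt} - b(μ - t) is strictly decreasing, with h(0) = 1 > 0 and h(1/(μ - t)) ≤ 0, so bisection on that interval finds the unique root.

```python
import math


def ifra_lower_bound(t, mu, iterations=200):
    """Return (bound, b) for an IFRA distribution with mean mu.

    For t >= mu the bound is 0 and b is None. For t < mu, b solves
    exp(-b*t) = b*(mu - t) and the bound is exp(-b*t).
    """
    if t >= mu:
        return 0.0, None

    def h(b):
        return math.exp(-b * t) - b * (mu - t)

    lo, hi = 0.0, 1.0 / (mu - t)  # h(lo) > 0 and h(hi) <= 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    b = 0.5 * (lo + hi)
    return math.exp(-b * t), b


if __name__ == "__main__":
    mu = 1.0  # illustrative mean, chosen arbitrarily
    for t in (0.25, 0.5, 0.75, 1.0):
        print(t, ifra_lower_bound(t, mu))
```

Since every IFR distribution is also IFRA, the value returned here should never exceed the IFR lower bound e^{-t/μ} at the same t.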

As an example, a monotone system containing only independent components, 
each of which is exponential (thereby IFRA), is itself IFRA, and the aforementioned 
bounds are applicable. Furthermore, these bounds are dependent only upon the system 
mean time to failure. 


21.7 Conclusions 


In recent years, the delivery of systems that perform adequately for a specified period 
of time in a given environment has become an important goal for both industry and 
government. In the space program, higher system reliability means the difference 
between life and death. In general, the cost of maintaining and/or repairing electronic 
equipment during the first year of operation often exceeds the purchase cost, giving 
impetus to the study and development of reliability techniques. 

This chapter has been concerned with determining system reliability (or bounds) 
from a knowledge of component reliability or characteristics of components, such as 
failure rate or mean time to failure. Even the desirable state of knowing these values 
may lead to cumbersome and sometimes crude results. However, it must be emphasized 
that these values, e.g., component reliability or mean time to failure, are not 
known and are often just the design engineers’ educated guesses. Furthermore, except 
in the case of the exponential distribution, knowledge of the mean time to failure 
leads to nothing but bounds. Also, it is evident that the reliability of components or 
systems depends heavily upon the failure rate, and the assumption of constant failure 
rate, which appears to be used frequently in practice, should not be made without 
careful analysis. 

The contents of the chapter have not been concerned with the statistical aspects 
of reliability, i.e., estimating reliability from test data. This subject was omitted 
because our emphasis is on probability models, but this is not a reflection on its 
importance. The statistical aspects of reliability may very well be the important prob- 
lem. Statistical estimation of component reliability is well in hand, but estimation of 
system reliability from component data is virtually an unsolved problem. 
PROBLEMS 


1.* Show that the structure function for a three-component system that functions if and 
only if component 1 functions and at least one of components 2 or 3 functions is given by 

\[
\phi(X_1, X_2, X_3) = X_1 \max(X_2, X_3) = X_1[1 - (1 - X_2)(1 - X_3)].
\]

2. Show that the structure function for a four-component system that functions if and 
only if components 1 and 2 function and at least one of components 3 or 4 functions is given 
by 

\[
\phi(X_1, X_2, X_3, X_4) = X_1 X_2 \max(X_3, X_4).
\]

3.* Find the reliability of the structure function given in Prob. 1 when each component 
has probability p_i of performing successfully. 


4. Find the reliability of the structure function given in Prob. 2 when each component 
has probability p_i of performing successfully. 


5. Consider a system consisting of three components (labeled 1, 2, 3) that operate 
simultaneously. The system is able to function satisfactorily as long as any two of the three 
components are still functioning satisfactorily. The goal is for the system to function satisfactorily 
for a length of time t, so the system’s reliability, R(t), is the probability that this will 
occur. The times until failure of the individual components are independently (but not identically) 
distributed, where p_i is the probability that the time until failure of component i exceeds 
t, for i = 1, 2, 3. 

(a) Is this a k out of n system? If so, what are k and n? 

(b) Draw a flow network representation of this system. 

(c) Develop an explicit expression for the structure function of this system. 

(d) Find R(t) as a function of the p_i’s. 


6. Consider a system consisting of five components, labeled 1, 2, 3, 4, 5. The system 
is able to function satisfactorily as long as at least one of the following three combinations of 
components has every component in that combination functioning satisfactorily: 


(1) Components 1 and 4; 
(2) Components 2 and 5; 
(3) Components 2, 3, and 4. 


For a given amount of time t, let R_i(t) be the known reliability of component i (i = 1, 2, 3, 
4, 5), that is, the probability that this component will function satisfactorily for this length of 
time. Assume that the times until failure of the individual components are independently distributed. 
Let R(t) be the unknown reliability of the overall system. 

(a) Draw a flow network representation of this system. 

(b) Develop an explicit expression for the structure function of this system. 

(c) Find R(t) as a function of the R_i(t). 


7. Suppose that there exist three different types of components, with two units of each 
type. Each unit operates independently, and each type has probability p_i of performing 
successfully. Either one or two systems can be built. One system can be assembled as follows: 
The two units of each type of component are put together in parallel, and the three types are 
then assembled to operate in series. Alternatively, two subsystems are assembled, each con- 
sisting of the three different types of components assembled in series. The final system is 
obtained by putting the two subsystems together in parallel. Which system has higher reliability? 


8.* Consider the following network. 





Assume that each component is independent with probability p_i of performing satisfactorily. 
(a) Find all the minimal paths and cuts. 
(b) Compute the exact system reliability, and evaluate it when p_i = p = 0.90. 
(c) Find upper and lower bounds on the reliability, and evaluate them when p_i = p = 0.90. 


9. Solve Prob. 8 by using the following network. 





Note that component 3 flows in both directions. 


10. Solve Prob. 8 by using the following network. 







12.* Suppose F is IFR, with μ = 0.5. Find upper and lower bounds on R(t) for 
(a) t = f and (b) t = 1. 


13. A time to failure distribution is said to have a Weibull distribution if the cumulative 
distribution function is given by 


F(t) = 1 e n B> 0. 
Find the failure rate, and show that the Weibull distribution is IFR when β ≥ 1 and DFR when 
0 < β ≤ 1. 


14. Suppose that a system consists of two different, but independent, components, 
arranged into a series system. Further, assume that the time to failure for each component is 
exponential with parameter θ_i, i = 1, 2. Show that the distribution of the time to failure of the 
system is IFR. 


15. Consider a parallel system consisting of two independent components whose time 
to failure distributions are exponential with parameters μ_1 and μ_2, respectively (μ_1 ≠ μ_2). 
Show that the time to failure distribution of the system is not IFR. 

\[
\begin{aligned}
R(t) &= P\{T_1 > t \text{ or } T_2 > t\} = 1 - P\{T_1 \le t \text{ and } T_2 \le t\} \\
     &= 1 - (1 - e^{-t/\mu_1})(1 - e^{-t/\mu_2}).
\end{aligned}
\]


16. For Prob. 15, show that the time to failure distribution is IFRA. 
Decision Analysis 


22.1 Introduction 


In recent years, decision analysis has become an important technique in business, 
industry, and government. Decision analysis provides a rational methodology for 
decision making in the face of uncertainty. It enables a manager to choose among 
alternatives in an optimal fashion, taking into account the worth of acquiring exper- 
imental data to reduce the uncertainty. 

This chapter presents a framework for making decisions when (1) experimen- 
tation is infeasible and (2) experimentation is possible, resulting in the availability of 
sample data. The criterion of optimality used to select among alternatives will be the 
minimization of expected cost. Among the problems considered in this chapter are 
the following: What is the decision that minimizes expected cost, given the result of 
an experiment (if indeed an experiment is performed)? By following the optimal 
policy, what is the expected cost? If an experiment is performed, will it be worthwhile; 
that is, will the decrease in expected cost be more than the cost of the experiment? 
Finally, what is the maximum amount of money that might be spent in order to 
eliminate all of the uncertainty? 


Table 22.1 Table of Profits for Oil Company 

                        500,000-       200,000-       50,000-
Action                  Barrel Well    Barrel Well    Barrel Well    Dry Well
Drill for oil           650,000        200,000        -25,000        -75,000
Unconditional lease      45,000         45,000         45,000         45,000
Conditional lease       250,000        100,000              0              0


EXAMPLE: Consider the following problem. An oil company owns some land that 
is purported to contain oil. The company classifies such land into four categories by 
the total number of barrels that are expected to be obtained from the well, i.e., a 
500,000-barrel well, a 200,000-barrel well, a 50,000-barrel well, or a dry well. The 
company is faced with deciding whether to drill for oil, to unconditionally lease the 
land to an independent oil driller, or to conditionally lease the land at a rate depending 
upon the oil strike. The cost of drilling a producing well is $100,000, and the cost of 
drilling a dry well is $75,000. For producing wells, the profit per barrel of oil is $1.50 
(after deducting all production costs). Under the unconditional lease agreement, the 
company receives $45,000 for the land, whereas under the conditional lease arrange- 
ment, the company receives 50 cents for each barrel of oil extracted, provided the 
land yields a 200,000- or 500,000-barrel strike; otherwise, it receives nothing. 
The possible profits for the oil company are shown in Table 22.1. 
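The entries of Table 22.1 follow directly from the figures quoted in the example. As an illustrative check (a sketch assuming Python; the constant and function names are hypothetical and not from the text), the table can be rebuilt from the drilling costs, the per-barrel profit, and the two lease terms:

```python
# Figures quoted in the oil-drilling example.
DRILL_COST_PRODUCING = 100_000   # cost of drilling a producing well
DRILL_COST_DRY = 75_000          # cost of drilling a dry well
PROFIT_PER_BARREL = 1.50         # after deducting all production costs
UNCONDITIONAL_LEASE = 45_000     # flat payment for the land
CONDITIONAL_RATE = 0.50          # dollars per barrel under the conditional lease
CONDITIONAL_THRESHOLD = 200_000  # smallest strike that triggers the payment

# Barrels for the four land categories; 0 denotes a dry well.
WELL_SIZES = [500_000, 200_000, 50_000, 0]


def profit_drill(barrels):
    if barrels == 0:
        return -DRILL_COST_DRY
    return PROFIT_PER_BARREL * barrels - DRILL_COST_PRODUCING


def profit_unconditional(barrels):
    return UNCONDITIONAL_LEASE


def profit_conditional(barrels):
    if barrels >= CONDITIONAL_THRESHOLD:
        return CONDITIONAL_RATE * barrels
    return 0


PROFITS = {
    "drill for oil": [profit_drill(b) for b in WELL_SIZES],
    "unconditional lease": [profit_unconditional(b) for b in WELL_SIZES],
    "conditional lease": [profit_conditional(b) for b in WELL_SIZES],
}

if __name__ == "__main__":
    for action, row in PROFITS.items():
        print(action, row)
```

Note that drilling a 50,000-barrel well loses money: the $75,000 of oil revenue does not cover the $100,000 drilling cost.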


22.2 Decision Making without Experimentation 


General Framework 


Before seeking a solution to the aforementioned problem, it is worthwhile to formulate 
a general framework for decision making. The decision maker must choose an action 
a from a set A of possible actions. In the oil-drilling example, the set A consists of 
three points, a_1, a_2, and a_3, that correspond to drilling for oil, unconditionally leasing 
the land, and conditionally leasing the land, respectively. In taking an action, the 
decision maker must be aware of its consequences, which will usually also be a 
function of the "state of nature." A state of nature θ is a representation of the actual 
real-world situation to which the action will apply. Generally, the states of nature are 
an enumeration, within the model according to some set of indices, of possible alternative 
representations of the physical phenomenon being studied. The set of possible 
values that θ can assume will be denoted by Θ. In the oil-drilling example, Θ consists 
of four points, θ_1, θ_2, θ_3, and θ_4, with θ_1 corresponding to the land yielding a 500,000-barrel 
well, θ_2 corresponding to a 200,000-barrel well, θ_3 corresponding to a 50,000-barrel 
well, and θ_4 corresponding to a dry well. Very often the states of nature are 
characterized by a parameter of a family of probability distributions. In the context 
of the oil-drilling example, the potential strikes might be viewed as the expected value 
of the random variable, oil yield, having some assumed form of probability distri- 
bution. Thus a representation of the model of this oil-drilling problem is that the oil 
yield in the site is a random variable with an unknown expected value. The company 
is willing to approximate this expected value by one of four values: 500,000 barrels, 
200,000 barrels, 50,000 barrels, and no barrels (dry). Thus the states of nature become 
these possible values of the expected value of the random variable, oil yield. 


Table 22.2 Loss Function for Oil-Drilling Example 

                                         State of Nature
                          θ_1: 500,000-  θ_2: 200,000-  θ_3: 50,000-   θ_4: Dry
Action                    Barrel Well    Barrel Well    Barrel Well    Well
a_1: drill for oil        -650,000       -200,000         25,000        75,000
a_2: unconditionally       -45,000        -45,000        -45,000       -45,000
     lease
a_3: conditionally        -250,000       -100,000              0             0
     lease

To measure the consequences of a decision maker’s action, we assume that there 
exists a loss function l(a, θ) that reflects the loss from taking action a when the state 
of nature is θ; it is defined for each combination of a and θ. If the problem is 
formulated in terms of gains, a gain can be treated as a negative loss. The loss function 
is generally measured in monetary terms, although other utility functions can be used. 
Note that l(a, θ) is assumed to be a function only of a and θ. The loss function for 
the oil-drilling example is easily obtained from Table 22.1 and is given in Table 22.2.¹ 
Although the loss function is easily obtained directly from the action and the state of 
nature in this example, occasionally the loss depends upon the outcome of a random 
variable whose probability distribution depends upon the true state of nature. For 
example, this situation would occur in the oil-drilling example if the profit were 
expressed directly in terms of the random variable, oil yield. The loss would then be 
a random variable, and l(a, θ) would then be interpreted as the expected value of the 
loss incurred when action a is taken and the true state of nature is θ. Hence even here 
the loss function depends upon only a and θ. In general, in formulating the problem, 
if the state of nature is defined so broadly that observing its value resolves all uncertainty 
relevant to the decision at hand, then the loss can always be expressed as a 
(deterministic) function of θ and the action a. If this is not the case, meaning that 
the observation of θ would still leave some uncertainty as to the ultimate consequence 
of a given action a, the loss function l(a, θ) is computed as the expected loss, given 
state θ and action a. 


Minimax Criterion 

If the true state of nature were known, it would be simple to choose the correct action, 
i.e., that action which has minimum loss. Unfortunately, the true state of nature is 
not generally known, and choosing a correct action is not simple. In the oil-drilling 
example, if θ = θ_1, a 500,000-barrel well, the best action is to drill for oil; whereas 
if θ = θ_4, a dry well, the best action is to lease unconditionally. This decision-theory 
formulation has the appearance of game theory as described in detail in Chap. 12, 
with the two players being the decision maker and nature. The actions correspond to 
the pure strategies of the decision maker, and the states of nature correspond to the 
pure strategies of nature. The payoff matrix in game theory is analogous to the loss 


1 In discussing this example throughout the chapter, many negative values appear. These are to be inter- 
preted as gains or profit and should not cause the reader any trouble. 


table. An approach for obtaining solutions to game theory problems is through the 
minimax principle. This principle tells the decision maker to find the maximum loss 
for each of his actions and to choose that action which has the smallest maximum 
loss. Similarly, the decision maker’s opponent, nature in this case, should find the 
minimum loss to the decision maker for each one of her possible states of nature and 
present to the decision maker that state of nature which maximizes this minimum loss. 
If these loss values are equal, the game is said to have a value. If a game has a value, 
and if each player follows his optimal strategy, the decision maker can guarantee that 
his loss will never exceed the value. Furthermore, if the decision maker follows his 
optimal strategy and if nature deviates from hers, the loss to the decision maker can 
only be decreased. Unfortunately, in this context a value does not always exist. 
However, it does exist in the oil-drilling example. Using the minimax criterion, the 
decision maker should choose action a_2 and guarantee that his loss will not exceed 
-45,000. Similarly, nature should choose state θ = θ_3 or θ = θ_4 and guarantee that 
the decision maker’s loss will be at least -45,000. Thus this "game" does indeed 
have a value, and the minimax strategy for the decision maker is to lease unconditionally. 
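The minimax reasoning for the oil-drilling example can be checked mechanically. The sketch below is illustrative only and assumes Python; the loss values are those of Table 22.2 (the negatives of the profits in Table 22.1), and all variable names are hypothetical.

```python
# Loss l(a, theta) for states theta_1..theta_4 (negative values are profits).
LOSS = {
    "a1": [-650_000, -200_000, 25_000, 75_000],   # drill for oil
    "a2": [-45_000, -45_000, -45_000, -45_000],   # unconditional lease
    "a3": [-250_000, -100_000, 0, 0],             # conditional lease
}
N_STATES = 4

# Decision maker: find the largest loss of each action, then pick the smallest.
worst_case = {action: max(row) for action, row in LOSS.items()}
minimax_action = min(worst_case, key=worst_case.get)
upper_value = worst_case[minimax_action]

# Nature: find the smallest loss available in each state, then pick the largest.
best_case = [min(LOSS[a][k] for a in LOSS) for k in range(N_STATES)]
lower_value = max(best_case)
maximin_states = [k + 1 for k in range(N_STATES) if best_case[k] == lower_value]

if __name__ == "__main__":
    print(minimax_action, upper_value)
    print(maximin_states, lower_value)
```

The two values coincide at -45,000, which is what it means for this game to have a value in pure strategies.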

A fundamental theorem in the theory of games states that if mixed strategies 
are allowed, and if the minimax principle is followed, the game always has a value. 
A mixed strategy for the decision maker is a probability distribution defined over the 
action space. The actual choice of strategy is dependent upon the outcome of a random 
device having a probability distribution associated with the action space. Thus choos- 
ing a mixed strategy is equivalent to choosing a probability distribution. Similarly, a 
mixed strategy for nature is a probability distribution defined over the possible states 
of nature. Pure strategies are just special cases of mixed strategies, where the probability 
assigned to the chosen action is 1 and the probability assigned to each other 
action is zero. Because both the action and the state of nature are random variables, 
the loss incurred is also a random variable, and again expected loss is the criterion. 

However, even though the minimax principle has some attractive properties, it 
is seldom used in games against nature because it is an extremely conservative criterion 
in this context. The actions taken when this principle is used assume that nature is a 
conscious opponent that wants to inflict as much damage as possible on the decision 
maker. Generally, nature is not a malevolent opponent, and it is unlikely that the 
decision maker has to guard against such an occurrence. 


Bayes’ Criterion 


The previous section pointed out that the minimax principle says to proceed as if 
nature will select a probability distribution, defined over the possible states of nature, 
which is least favorable to the decision maker. It was also noted that this approach is 
very conservative because there is no reason to expect nature to use this distribution. 
As a matter of fact, in some situations the decision maker will actually have some 
advance information about θ that contradicts this assumption about what nature will 
do. When the decision maker has such information, he certainly should take it into 
account. Such information can usually be translated into a probability distribution, 
acting as though the state of nature is a random variable, in which case this distribution 
is referred to as a prior distribution. Prior distributions are often subjective in that 
they may depend upon the experience or intuition of an individual. 

For example, in the oil-drilling problem, the company has had some experience 
with wells in similar geographic areas and has concluded that about 10 percent of the 
strikes are 500,000-barrel wells, 15 percent are 200,000-barrel wells, 25 percent are 
50,000-barrel wells, and 50 percent are dry wells. Hence these data can be translated 
into the prior distribution as follows: 


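\[
\begin{aligned}
P\{\theta = \theta_1\} &= P_\theta(1) = 0.10 \\
P\{\theta = \theta_2\} &= P_\theta(2) = 0.15 \\
P\{\theta = \theta_3\} &= P_\theta(3) = 0.25 \\
P\{\theta = \theta_4\} &= P_\theta(4) = 0.50.
\end{aligned}
\]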





A procedure for using the prior distribution to aid in the selection of an action 
is Bayes’ criterion. Bayes’ principle tells the decision maker to select that action 
(called Bayes’ decision procedure) which minimizes the expected loss. The expected 
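loss l(a) is evaluated with respect to the prior distribution, which is defined over the 
possible states of nature; i.e., 

\[
l(a) = E[l(a, \theta)] =
\begin{cases}
\sum_{\text{all } k} l(a, k) P_\theta(k), & \text{if } \theta \text{ is discrete} \\
\int_{-\infty}^{\infty} l(a, y) P_\theta(y)\, dy, & \text{if } \theta \text{ is continuous.}
\end{cases}
\]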
Thus, for the oil-drilling example, the expected loss l(a) for each action is given by 

\[
\begin{aligned}
l(a_1) = E[l(a_1, \theta)] &= -650{,}000(0.10) - 200{,}000(0.15) + 25{,}000(0.25) + 75{,}000(0.50) \\
                           &= -\$51{,}250, \\
l(a_2) = E[l(a_2, \theta)] &= -45{,}000(0.10) - 45{,}000(0.15) - 45{,}000(0.25) - 45{,}000(0.50) \\
                           &= -\$45{,}000, \\
l(a_3) = E[l(a_3, \theta)] &= -250{,}000(0.10) - 100{,}000(0.15) \\
                           &= -\$40{,}000.
\end{aligned}
\]


Hence using Bayes’ principle leads to selecting action a_1, that is, drill for oil, and the 
associated expected loss is -$51,250 (profit). It is interesting to speculate as to 
whether the decision maker could have improved upon this expected loss by making 
use of a mixed strategy rather than a pure strategy (because nature is using the mixed 
strategy specified by the prior distribution). It can be shown that the decision maker 
cannot improve his position by using mixed strategies, so that it is sufficient for him 
to consider only pure strategies. 
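The expected-loss comparison under Bayes’ criterion amounts to one weighted sum per action. A minimal sketch, assuming Python and using the prior probabilities and loss values stated in this section (the names `PRIOR`, `LOSS`, and `expected_loss` are hypothetical):

```python
# Prior distribution over theta_1..theta_4 from the company's experience.
PRIOR = [0.10, 0.15, 0.25, 0.50]

# Loss l(a, theta) for states theta_1..theta_4 (negative values are profits).
LOSS = {
    "a1": [-650_000, -200_000, 25_000, 75_000],   # drill for oil
    "a2": [-45_000, -45_000, -45_000, -45_000],   # unconditional lease
    "a3": [-250_000, -100_000, 0, 0],             # conditional lease
}


def expected_loss(action):
    """l(a) = sum over k of l(a, k) * P_theta(k)."""
    return sum(loss * p for loss, p in zip(LOSS[action], PRIOR))


EXPECTED = {action: expected_loss(action) for action in LOSS}
bayes_action = min(EXPECTED, key=EXPECTED.get)

if __name__ == "__main__":
    for action, value in EXPECTED.items():
        print(action, round(value, 2))
    print("Bayes action:", bayes_action)
```

Because the expected loss of a mixed strategy is a weighted average of these three numbers, it can never fall below the smallest of them, which is why pure strategies suffice here.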


22.3 Decision Making with Experimentation 


The previous sections assumed that the decision maker was to make his decision 
without experimentation. However, if some experimentation is possible (perhaps at a 
cost), the data derived from this experimentation should be incorporated into the 
decision-making process. For example, returning to the oil-drilling example, suppose 
that it is possible to obtain seismic soundings at a cost of $12,000. This information 


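Table 22.3 Frequency of Seismic Classifications 

Seismic           θ_1: 500,000-   θ_2: 200,000-   θ_3: 50,000-    θ_4: Dry
Classification    Barrel Well     Barrel Well     Barrel Well     Well
1                 7 (7/12)        9 (9/16)        11 (11/24)      9 (9/48)
2                 4 (4/12)        3 (3/16)        6 (6/24)        13 (13/48)
3                 1 (1/12)        2 (2/16)        3 (3/24)        15 (15/48)
4                 0 (0/12)        2 (2/16)        4 (4/24)        11 (11/48)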





leads to four possible seismic classifications, denoted by (1), (2), (3), and (4). Clas- 
sification (1) denotes that there is definitely a closed geologic structure to the site (a 
very favorable condition if the presence of oil is desired); classification (2) denotes 
that there is probably a closed structure to the site; classification (3) denotes that there 
is a nonclosed structure to the site (a relatively unfavorable condition); and classifi- 
cation (4) denotes that there is no structure to the site (an unfavorable condition). 
Based upon past examination of similar geologic areas (100 such examinations), the 
company obtains the data presented in Table 22.3.¹ The values in parentheses in each 
cell can be interpreted as conditional probabilities, given the state of nature: e.g., if 
the well is a 200,000-barrel well, then 3/16 can be interpreted as the conditional probability 
that the seismic reading is classified as (2) (probably a closed structure to the 
site); if the well is dry, then 15/48 can be interpreted as the conditional probability that 
the seismic reading is classified as (3) (a nonclosed structure to the site); and so on. 
Before proceeding with the example, we shall discuss a general method for incorpo- 
rating these data. 

Let X denote the information made available by experimentation obtained from 
a random sample. X is then a random variable and may be viewed as a function of 
the sample data; for example, X may denote a sample mean, the maximum of the 
sample, a vector of the sample observations, the third observation in a sample, and 
so forth. The decision maker is to choose a decision procedure, rule, or strategy, 
which tells him the form and amount of experimentation and what action to take for 
each possible value that X may take on. Denote this function to be chosen as d[x], so 
that if the random variable X takes on the value x, then a = d[x] would be the action 
to be taken. The decision maker, then, is interested in choosing a function d, from 
among the many possible decision functions, that is, in some sense, optimal. (Indeed, 
part of the problem here is to choose a good working definition for the term optimal.) 
To evaluate a decision function, we must explore its consequences. Because the action 
taken, a, is a function of the outcome of the random variable X, then d[X] is also a 
random variable, and the loss associated with that action also depends upon the out- 
come of this random variable. An appropriate measure of the consequences of taking 
action a = d[X], when the true state of nature is θ, is then given by the expected 
value of the loss. This quantity will be known as the risk function R(d, θ); that is, 

\[
R(d, \theta) = E[l(d[X], \theta)],
\]


where the expectation is taken with respect to the probability distribution of the random 
variable X, and the loss function includes the cost of experimentation. 


¹ Although the actual fraction of wells falling historically into the four categories differs slightly from the 
prior distribution, the prior probabilities given under Bayes’ criterion in Sec. 22.2 are thought to be more 
representative of what to expect for this particular site and will be used subsequently. 
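Now consider how to apply this approach to the oil-drilling example. Suppose 
the following decision rule, d_1, is to be evaluated. If the seismic reading is classified 
as (1), take action a_1; if the seismic reading is classified as (2) or (3), take action a_3; 
and if the seismic reading is classified as (4), take action a_2; that is, 

\[
d_1[x] =
\begin{cases}
a_1, & \text{for } x = 1 \\
a_2, & \text{for } x = 4 \\
a_3, & \text{for } x = 2 \text{ or } x = 3.
\end{cases}
\]

Therefore, 

\[
\begin{aligned}
R(d_1, \theta_1) &= -650{,}000\left(\tfrac{7}{12}\right) - 45{,}000\left(\tfrac{0}{12}\right) - 250{,}000\left(\tfrac{4}{12} + \tfrac{1}{12}\right) + 12{,}000 = -\$471{,}333, \\
R(d_1, \theta_2) &= -200{,}000\left(\tfrac{9}{16}\right) - 45{,}000\left(\tfrac{2}{16}\right) - 100{,}000\left(\tfrac{3}{16} + \tfrac{2}{16}\right) + 12{,}000 = -\$137{,}375, \\
R(d_1, \theta_3) &= 25{,}000\left(\tfrac{11}{24}\right) - 45{,}000\left(\tfrac{4}{24}\right) + 0\left(\tfrac{6}{24} + \tfrac{3}{24}\right) + 12{,}000 = \$15{,}958, \\
R(d_1, \theta_4) &= 75{,}000\left(\tfrac{9}{48}\right) - 45{,}000\left(\tfrac{11}{48}\right) + 0\left(\tfrac{13}{48} + \tfrac{15}{48}\right) + 12{,}000 = \$15{,}750.
\end{aligned}
\]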


Note that the 12,000 represents the cost of obtaining the seismic data. 
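The four risk values for d_1 can be reproduced with exact rational arithmetic. The sketch below is an illustration, assuming Python; the names `FREQ`, `D1`, and `risk` are hypothetical. The counts are those of Table 22.3, with the dry-well column taken as 9, 13, 15, 11 (an assumption inferred from the 48ths used in the text and the total of 100 examinations).

```python
from fractions import Fraction

# FREQ[x][k]: number of past sites with seismic classification x whose true
# state was theta_(k+1). The last column (dry wells) is an inferred assumption.
FREQ = {
    1: [7, 9, 11, 9],
    2: [4, 3, 6, 13],
    3: [1, 2, 3, 15],
    4: [0, 2, 4, 11],
}

# Loss l(a, theta) for states theta_1..theta_4 (negative values are profits).
LOSS = {
    "a1": [-650_000, -200_000, 25_000, 75_000],
    "a2": [-45_000, -45_000, -45_000, -45_000],
    "a3": [-250_000, -100_000, 0, 0],
}
SEISMIC_COST = 12_000

# Decision rule d_1: seismic classification -> action.
D1 = {1: "a1", 2: "a3", 3: "a3", 4: "a2"}


def risk(rule, k):
    """R(d, theta_(k+1)): expected loss over X given the state, plus the cost."""
    column_total = sum(FREQ[x][k] for x in FREQ)
    expected_loss = sum(
        Fraction(FREQ[x][k], column_total) * LOSS[rule[x]][k] for x in FREQ
    )
    return expected_loss + SEISMIC_COST


RISKS_D1 = [risk(D1, k) for k in range(4)]

if __name__ == "__main__":
    for k, value in enumerate(RISKS_D1, start=1):
        print(f"R(d1, theta_{k}) = {float(value):.2f}")
```

Any other decision rule can be evaluated by passing a different mapping from classifications to actions in place of `D1`.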

Thus it is evident how the risk function for a given decision procedure is eval- 
uated. The risk function provides a means for defining optimality. An optimal decision 
function might be defined as one that will minimize the risk for every value of θ. 
However, it is evident that an optimal decision function (in this sense) may not always 
exist and, in fact, does not exist in most cases. Thus the preceding definition is 
inadequate. Hence another definition of optimality is considered in the next section. 


Bayes’ Procedures 


Even when data are available, there is no best definition of optimal procedures. With 
data, it is still possible to use a minimax criterion or a minimax decision function, 
but it too suffers from the same disadvantages as it does when no data are available; 
i.e., it assumes that nature will act as a conscious opponent and confront the decision 
maker with the least favorable distribution of θ. 

If the decision maker has some advance information about the states of nature 
that can be described in terms of a prior distribution, then Bayes’ principle can be 
applied to the risk function. If the states of nature are discrete, Bayes’ risk corresponding 
to a decision function d and a prior probability distribution of θ, P_θ(k), is 
given by 

\[
B(d) = \sum_{\text{all } k} R(d, k) P_\theta(k).
\]

If the states of nature are continuous, Bayes’ risk corresponding to a prior probability 
density function of θ, P_θ(y), is given by 

\[
B(d) = \int_{-\infty}^{\infty} R(d, y) P_\theta(y)\, dy.
\]


Bayes’ risk provides another means for defining optimality for decision rules using 
Bayes’ principle. Bayes’ principle tells the decision maker to select that function d 
(called the Bayes’ decision procedure) which minimizes B(d). A method for finding 
Bayes’ decision procedure follows. 
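As a worked illustration (not in the original text), Bayes’ risk of the rule d_1 from the risk-function example can be obtained by weighting its four risk values by the prior probabilities 0.10, 0.15, 0.25, and 0.50 given in Sec. 22.2. The arithmetic uses the rounded risk values, so the result is approximate:

```latex
\begin{aligned}
B(d_1) &= \sum_{k=1}^{4} R(d_1, \theta_k)\, P_\theta(k) \\
       &\approx (-471{,}333)(0.10) + (-137{,}375)(0.15) + (15{,}958)(0.25) + (15{,}750)(0.50) \\
       &\approx -\$55{,}875.
\end{aligned}
```

Under these figures, d_1 has a lower expected loss than the -$51,250 obtained without experimentation, even though its risk includes the $12,000 cost of the seismic soundings. This does not by itself show that d_1 is Bayes’ decision procedure; it only evaluates one candidate rule.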

When no data were available, using Bayes’ procedure led us to select that action 
which minimized the expected loss; this expectation was evaluated with respect to the 
prior distribution of θ. Now that data are available, additional information is available 
about the state of nature. For example, if the seismic data are classified as (4), they 
are evidence that the strike will not be a 500,000-barrel well and probably not a 
200,000-barrel well. Hence, after observing the experimental data, we should update 
the prior distribution by using more timely information about the probability distribution 
of the state of nature. Such updated information is called the posterior distribution 
of θ, given the prior distribution and the data X = x. The posterior distribution 
of θ is just the conditional distribution of θ, given X = x. If θ is discrete, the posterior 
distribution will be denoted by h_{θ|X=x}(k), and, if θ is continuous, then the posterior 
distribution will be denoted by h_{θ|X=x}(y). The method for calculating the posterior 
distribution is given later. However, if the method used for calculating Bayes’ procedure 
when no data are available is followed (selecting that action which minimizes 
the expected loss), with this expectation now evaluated with respect to the posterior 
distribution of θ, given X = x, this decision procedure minimizes B(d). Hence it is 
Bayes’ procedure. This statement is not obvious, but it can easily be proved. Thus, 
to find Bayes’ procedure, the decision maker computes the posterior distribution of 
θ, given X = x. He then chooses that action which minimizes the expected loss¹ l_h(a) 
(including the cost of experimentation),² with this expectation evaluated with respect 
to the posterior distribution of θ, given X = x, where 

\[
l_h(a) = E[l(a, \theta)] =
\begin{cases}
\sum_{\text{all } k} l(a, k)\, h_{\theta|X=x}(k), & \text{if } \theta \text{ is discrete} \\
\int_{-\infty}^{\infty} l(a, y)\, h_{\theta|X=x}(y)\, dy, & \text{if } \theta \text{ is continuous.}
\end{cases}
\]


In the oil-drilling example, the posterior distribution can be calculated by methods to 
be discussed later in this section and is given in Table 22.4. 

Table 22.4 Values of h_{θ|X=x}(k), the Posterior Distribution of θ 


Suppose that the seismic reading is classified as (3) (the geologic site has a 
nonclosed structure). To obtain Bayes’ procedure, we compute the expected loss with 
respect to the posterior distribution of θ, given X = 3, for each of the actions as 
follows: 

\[
\begin{aligned}
l_h(a_1) = E[l(a_1, \theta)] &= -650{,}000(0.039) - 200{,}000(0.087) + 25{,}000(0.146) + 75{,}000(0.728) + 12{,}000 \\
                             &= \$27{,}500, \\
l_h(a_2) = E[l(a_2, \theta)] &= -45{,}000 + 12{,}000 = -\$33{,}000, \\
l_h(a_3) = E[l(a_3, \theta)] &= -250{,}000(0.039) - 100{,}000(0.087) + 12{,}000 \\
                             &= -\$6{,}450.
\end{aligned}
\]


¹ Note that loss is used rather than risk. 

² Although the notation l_h(a) does not show it, one should remember that this quantity does indeed depend 
upon the experimental outcome, x. 


Bayes’ procedure selects action a_2 (because this action minimizes the expected loss), 
which implies that the company should unconditionally lease the land. Thus we see 
that the experimental data change the action of the decision maker. Without experimentation, 
using Bayes’ procedure recommended drilling for oil, whereas the information 
obtained from the seismic data recommends that the company unconditionally 
lease the land. Incidentally, although Table 22.4 was obtained for all values of x, it 
was necessary to obtain the values for only x = 3. In fact, this method of computing 
Bayes’ procedure has the important advantage that it is necessary to compute only the 
optimal d[x] for the single point that corresponds to the outcome of the experiment. 
Using the basic formula for B(d) to find Bayes’ procedure requires the determination 
of the entire optimal decision function, which is generally more difficult. 
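The posterior expected losses for a seismic reading of (3) can be recomputed from the rounded posterior probabilities quoted above (0.039, 0.087, 0.146, 0.728). This is a minimal sketch assuming Python; the names `POSTERIOR_X3` and `posterior_expected_loss` are hypothetical.

```python
# Posterior distribution of theta given X = 3, as rounded in the text.
POSTERIOR_X3 = [0.039, 0.087, 0.146, 0.728]

# Loss l(a, theta) for states theta_1..theta_4 (negative values are profits).
LOSS = {
    "a1": [-650_000, -200_000, 25_000, 75_000],   # drill for oil
    "a2": [-45_000, -45_000, -45_000, -45_000],   # unconditional lease
    "a3": [-250_000, -100_000, 0, 0],             # conditional lease
}
SEISMIC_COST = 12_000


def posterior_expected_loss(action):
    """l_h(a): expected loss under the posterior, plus the experiment cost."""
    weighted = sum(loss * h for loss, h in zip(LOSS[action], POSTERIOR_X3))
    return weighted + SEISMIC_COST


LOSSES_X3 = {action: posterior_expected_loss(action) for action in LOSS}
bayes_action_x3 = min(LOSSES_X3, key=LOSSES_X3.get)

if __name__ == "__main__":
    for action, value in LOSSES_X3.items():
        print(action, round(value, 2))
    print("Bayes action for x = 3:", bayes_action_x3)
```

Only the posterior for the observed outcome is needed, which is the practical advantage the paragraph above points out.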


Calculation of the Posterior Distribution 


Denote by \((\theta, X)\) a bivariate random variable having a joint probability distribution.
Consider the case where \((\theta, X)\) is a discrete bivariate random variable, with joint
probability distribution given by \(P_{\theta X}(k, j)\). Each random variable \(\theta\) and X has a
marginal distribution. In fact, \(P_\theta(k)\), the prior distribution of \(\theta\), is the marginal
distribution of \(\theta\). The usual expression given as the probability distribution of the random
variable X actually corresponds to the conditional probability distribution of X, given
\(\theta\). For example, if X has a Poisson distribution with parameter \(\theta = 24\), then \(e^{-24} 24^{j} / j!\)
is just the conditional probability distribution function of X, given \(\theta = 24\).¹ To indicate
that this distribution is a conditional distribution, we introduce the notation


\[
Q_{X|\theta=k}(j) = P\{X = j \mid \theta = k\}.
\]

Thus, in the Poisson example,

\[
Q_{X|\theta=24}(j) = P\{X = j \mid \theta = 24\} = \frac{e^{-24}\, 24^{j}}{j!}
\]

represents the conditional probability distribution of X, given that \(\theta = 24\), and it has
the form of a Poisson distribution with parameter \(\theta = 24\).

If the joint distribution of \((\theta, X)\) is of interest, the expression

\[
P_{\theta X}(k, j) = Q_{X|\theta=k}(j)\, P_\theta(k)
\]


¹ In order to be consistent with the notation introduced in this chapter, \(\theta\) is used instead of the more usual
symbol \(\lambda\) to denote the parameter of the Poisson distribution.


can be used to evaluate it. \(Q_X(j)\), the marginal distribution of X, can also be obtained;
i.e.,

\[
Q_X(j) = \sum_{\text{all } k} P_{\theta X}(k, j) = \sum_{\text{all } k} Q_{X|\theta=k}(j)\, P_\theta(k).
\]


Finally, the only remaining probability distribution that has not been discussed
is the conditional distribution of \(\theta\), given X = j, that is, the posterior distribution of
\(\theta\), given X = j, \(h_{\theta|X=j}(k)\). An alternative expression for the joint probability
distribution of \((\theta, X)\) is given by

\[
P_{\theta X}(k, j) = h_{\theta|X=j}(k)\, Q_X(j).
\]

Equating the two expressions for \(P_{\theta X}(k, j)\), and letting j = x (the outcome of the
experiment), leads to the important result from which the posterior distribution can
be calculated; i.e.,

\[
h_{\theta|X=x}(k) = \frac{Q_{X|\theta=k}(x)\, P_\theta(k)}{Q_X(x)}.
\]

Thus, in summary, the posterior distribution can be calculated by using the
preceding expression. \(P_\theta(k)\) is the prior distribution. \(Q_{X|\theta=k}(x)\) is the ordinary
expression for the probability distribution of the random variable X evaluated at X = x, but
it is written in this form to show its dependence upon the value of the parameter \(\theta\).
The function \(Q_X(x)\) is the marginal distribution of the random variable X, evaluated
at X = x, and is obtained from

\[
Q_X(x) = \sum_{\text{all } k} Q_{X|\theta=k}(x)\, P_\theta(k).
\]
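As a minimal sketch of the summary above, the following Python function multiplies the prior by the conditional distribution, sums to get the marginal, and divides. The two candidate Poisson parameters (20 and 24), the equal prior weights, and the observed value 22 are hypothetical numbers chosen only for illustration; the function names are likewise not from the text.

```python
import math


def poisson_pmf(j, theta):
    """Q_{X|theta}(j): Poisson probability of observing j when the parameter is theta."""
    return math.exp(-theta) * theta**j / math.factorial(j)


def posterior(prior, likelihood, x):
    """Return (posterior dict, marginal Q_X(x)) for a discrete prior.

    prior:      dict mapping each parameter value k to P_theta(k)
    likelihood: function (x, k) -> Q_{X|theta=k}(x)
    """
    joint = {k: likelihood(x, k) * p for k, p in prior.items()}  # P_{theta X}(k, x)
    marginal = sum(joint.values())                                # Q_X(x)
    if marginal == 0:
        raise ValueError("observed outcome has zero probability under the prior")
    return {k: v / marginal for k, v in joint.items()}, marginal


# Hypothetical illustration: two candidate parameter values, equally likely a priori.
prior = {20: 0.5, 24: 0.5}
post, q_x = posterior(prior, poisson_pmf, 22)
print(post, q_x)
```

The same function applies to any discrete prior and any discrete conditional distribution; only the `likelihood` argument changes.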


Returning to the oil-drilling example, suppose that the seismic reading is classified
as (3) (the geologic site has a nonclosed structure). Recall that the prior distribution
of the classification of the land is assumed to be

\[
\begin{aligned}
P\{\theta = \theta_1\} &= P_\theta(1) = 0.10 \\
P\{\theta = \theta_2\} &= P_\theta(2) = 0.15 \\
P\{\theta = \theta_3\} &= P_\theta(3) = 0.25 \\
P\{\theta = \theta_4\} &= P_\theta(4) = 0.50.
\end{aligned}
\]








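It is necessary to evaluate the expressions \(h_{\theta|X=3}(1)\), \(h_{\theta|X=3}(2)\), \(h_{\theta|X=3}(3)\), and
\(h_{\theta|X=3}(4)\), where

\[
h_{\theta|X=3}(k) = \frac{Q_{X|\theta=k}(3)\, P_\theta(k)}{Q_X(3)}, \qquad \text{for } k = 1, 2, 3, \text{ and } 4.
\]

In this case, \(Q_{X|\theta=k}(3) = P\{X = 3 \mid \theta = k\}\) is just the probability that the seismic
reading will be classified as (3), given that the well is a \(\theta_k\)-barrel well, and these
values can be obtained directly from Table 22.3. Hence

\[
Q_{X|\theta=1}(3) = \tfrac{1}{12}, \quad Q_{X|\theta=2}(3) = \tfrac{1}{8}, \quad Q_{X|\theta=3}(3) = \tfrac{1}{8}, \quad Q_{X|\theta=4}(3) = \tfrac{5}{16}.
\]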


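The marginal distribution of the random variable X evaluated at X = 3 can now be
obtained; that is,

\[
\begin{aligned}
Q_X(3) &= Q_{X|\theta=1}(3) P_\theta(1) + Q_{X|\theta=2}(3) P_\theta(2) + Q_{X|\theta=3}(3) P_\theta(3) + Q_{X|\theta=4}(3) P_\theta(4) \\
&= \tfrac{1}{12}(0.10) + \tfrac{1}{8}(0.15) + \tfrac{1}{8}(0.25) + \tfrac{5}{16}(0.50) = 0.2147,
\end{aligned}
\]

so that the posterior distribution can now be calculated; i.e.,

\[
\begin{aligned}
h_{\theta|X=3}(1) &= \frac{\tfrac{1}{12}(0.10)}{0.2147} = 0.039 \\
h_{\theta|X=3}(2) &= \frac{\tfrac{1}{8}(0.15)}{0.2147} = 0.087 \\
h_{\theta|X=3}(3) &= \frac{\tfrac{1}{8}(0.25)}{0.2147} = 0.146 \\
h_{\theta|X=3}(4) &= \frac{\tfrac{5}{16}(0.50)}{0.2147} = 0.728.
\end{aligned}
\]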


There exists a simple tabular algorithm that will yield Table 22.4, i.e., the
posterior distribution of \(\theta\). Table 22.5 is such a table. The entries in the top row of
Table 22.5a are simply the prior probabilities of the well sizes, whereas the entries
in the bottom four rows are the conditional probabilities of a seismic classification
given the size of a well. The entries in the first four columns of Table 22.5b are the
entries of the corresponding elements of Table 22.5a, multiplied by the appropriate
prior probability [e.g., 0.0583 = (7/12)(0.10)]. The entries in the last column of Table
22.5b are simply the sum of the four elements in the corresponding row (e.g., 0.3511
= 0.0583 + 0.0844 + 0.1146 + 0.0938). The entries in Table 22.5c, the conditional
probabilities of obtaining a particular well size given a seismic classification, are the
entries of the corresponding elements of Table 22.5b, divided by the appropriate
element \(Q_X(x)\), in the last column of Table 22.5b (e.g., 0.166 = 0.0583/0.3511).
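The three steps of the tabular algorithm (multiply by the prior, sum each row, divide by the row sum) can be sketched in Python as follows. The conditional probabilities are written as fractions recovered by dividing the entries of Table 22.5b by the priors, so they are an assumption about Table 22.3 rather than a copy of it; because the code works with unrounded values, a few results differ from the printed table in the last digit.

```python
from fractions import Fraction as F

PRIOR = [0.10, 0.15, 0.25, 0.50]  # P_theta(k), k = 1..4

# COND[x-1][k-1] = Q_{X|theta=k}(x); values inferred from Table 22.5b (assumption).
COND = [
    [F(7, 12), F(9, 16), F(11, 24), F(3, 16)],   # x = 1
    [F(1, 3),  F(3, 16), F(1, 4),   F(13, 48)],  # x = 2
    [F(1, 12), F(1, 8),  F(1, 8),   F(5, 16)],   # x = 3
    [F(0),     F(1, 8),  F(1, 6),   F(11, 48)],  # x = 4
]


def tabular_posterior(prior, cond):
    """Return (joint, marginal, posterior) tables, one row per outcome x."""
    joint = [[float(q) * p for q, p in zip(row, prior)] for row in cond]   # part (b)
    marginal = [sum(row) for row in joint]                                  # last column of (b)
    post = [[v / m for v in row] for row, m in zip(joint, marginal)]        # part (c)
    return joint, marginal, post


joint, marginal, post = tabular_posterior(PRIOR, COND)
for x, (m, row) in enumerate(zip(marginal, post), start=1):
    print(x, round(m, 4), [round(h, 3) for h in row])
```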

Table 22.5 Tabular Algorithm for Computation of Posterior Distribution: Oil-Drilling Example

(a) Prior distribution \(P_\theta(k)\) and conditional probabilities \(Q_{X|\theta=k}(x)\)

                  | \(\theta_1\) | \(\theta_2\) | \(\theta_3\) | \(\theta_4\)
\(P_\theta(k)\)   | 0.10   | 0.15   | 0.25   | 0.50
x = 1             | 7/12   | 9/16   | 11/24  | 3/16
x = 2             | 1/3    | 3/16   | 1/4    | 13/48
x = 3             | 1/12   | 1/8    | 1/8    | 5/16
x = 4             | 0      | 1/8    | 1/6    | 11/48

(b) Products \(Q_{X|\theta=k}(x) P_\theta(k)\) and row sums \(Q_X(x)\)

x | \(\theta_1\) | \(\theta_2\) | \(\theta_3\) | \(\theta_4\) | \(Q_X(x)\)
1 | 0.0583 | 0.0844 | 0.1146 | 0.0938 | 0.3511
2 | 0.0333 | 0.0281 | 0.0625 | 0.1354 | 0.2593
3 | 0.0083 | 0.0188 | 0.0313 | 0.1563 | 0.2147
4 | 0.0    | 0.0188 | 0.0417 | 0.1146 | 0.1751

(c) Posterior distribution of \(\theta\), \(h_{\theta|X=x}(k)\)

x | \(\theta_1\) | \(\theta_2\) | \(\theta_3\) | \(\theta_4\)
1 | 0.166 | 0.240 | 0.327 | 0.267
2 | 0.129 | 0.108 | 0.241 | 0.522
3 | 0.039 | 0.087 | 0.146 | 0.728
4 | 0.0   | 0.107 | 0.238 | 0.655


The expression for the posterior distribution has been given, where \(\theta\) and X are
both discrete random variables. If \(\theta\) is discrete and X is continuous, the posterior
distribution \(h_{\theta|X=x}(k)\) is given by

\[
h_{\theta|X=x}(k) = \frac{f_{X|\theta=k}(x)\, P_\theta(k)}{f_X(x)},
\]

where \(P_\theta(k)\) is the prior distribution of \(\theta\), and \(f_{X|\theta=k}(x)\) is the ordinary expression for
the density function of the random variable X, but it is written in this form to show
its dependence upon the value of the parameter \(\theta\). The function \(f_X(x)\) is the marginal
density of the random variable X, and it is obtained from

\[
f_X(x) = \sum_{\text{all } k} f_{X|\theta=k}(x)\, P_\theta(k).
\]

If both \(\theta\) and X are continuous, the posterior distribution \(h_{\theta|X=x}(y)\) is given by

\[
h_{\theta|X=x}(y) = \frac{f_{X|\theta=y}(x)\, P_\theta(y)}{f_X(x)},
\]

where \(P_\theta(y)\) is the prior density function of \(\theta\). The function \(f_{X|\theta=y}(x)\) has been defined
previously. \(f_X(x)\) is the marginal density of the random variable X, and it is obtained
from

\[
f_X(x) = \int_{-\infty}^{\infty} f_{X|\theta=y}(x)\, P_\theta(y)\, dy.
\]

Finally, if \(\theta\) is continuous and X is discrete, the posterior distribution \(h_{\theta|X=x}(y)\)
is given by

\[
h_{\theta|X=x}(y) = \frac{Q_{X|\theta=y}(x)\, P_\theta(y)}{Q_X(x)},
\]

where \(P_\theta(y)\) is the prior density function of \(\theta\). The function \(Q_{X|\theta=y}(x)\) is the ordinary
expression for the probability distribution of the random variable X, that is,
\(P\{X = x \mid \theta = y\}\), but it is written in this form to show its dependence upon the value
of the parameter \(\theta\). \(Q_X(x)\) is the marginal distribution of the random variable X, and
it is obtained from

\[
Q_X(x) = \int_{-\infty}^{\infty} Q_{X|\theta=y}(x)\, P_\theta(y)\, dy.
\]
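When the parameter is continuous, the integral for the marginal distribution usually has to be evaluated numerically. The sketch below illustrates the last case (continuous parameter, discrete X) with a midpoint-rule approximation. The binomial conditional distribution, the uniform prior density on [0, 1], and the data (3 successes in 10 trials) are hypothetical choices made only so that the result can be checked against a known closed form; none of them comes from the text.

```python
import math


def binomial_pmf(x, n, y):
    """Q_{X|theta=y}(x): probability of x successes in n trials with success probability y."""
    return math.comb(n, x) * y**x * (1 - y) ** (n - x)


def posterior_density_on_grid(prior_density, likelihood, x, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the posterior density h(y) on [lo, hi].

    Returns (grid, posterior values on the grid, marginal Q_X(x), cell width).
    """
    width = (hi - lo) / steps
    grid = [lo + (i + 0.5) * width for i in range(steps)]
    joint = [likelihood(x, y) * prior_density(y) for y in grid]
    marginal = sum(joint) * width  # approximates the integral defining Q_X(x)
    return grid, [v / marginal for v in joint], marginal, width


N_TRIALS, OBSERVED = 10, 3  # hypothetical data
grid, post, q_x, width = posterior_density_on_grid(
    prior_density=lambda y: 1.0,  # uniform prior density on [0, 1] (assumption)
    likelihood=lambda x, y: binomial_pmf(x, N_TRIALS, y),
    x=OBSERVED,
    lo=0.0,
    hi=1.0,
)
posterior_mean = sum(y * h for y, h in zip(grid, post)) * width
print(q_x, posterior_mean)
```

For this particular choice the marginal is 1/(n + 1) and the posterior mean is (x + 1)/(n + 2), which gives a simple check on the numerical integration.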


Value of Experimentation 


Before performing any experiment, we should determine its potential value. Suppose 
the experiment can lead to perfect information about the state of nature. What is this 
perfect information worth? In the oil-drilling example, seismic information that is 
imperfect costs $12,000. If perfect information saves, say, only $10,000, seismic 
information should be forgone because it is too expensive. If we knew that the strike
would be a 500,000-barrel well (state of nature is \(\theta_1\)), then the best action to take
clearly would be \(a_1\), drill for oil. We can see from Table 22.2 that this action would
lead to a loss of -$650,000. Similarly, if we knew that the strike would be a 200,000-barrel
well (state of nature is \(\theta_2\)), the best action again would be \(a_1\), with a corresponding
loss of -$200,000. However, if we knew that the strike would be a 50,000-barrel
well (state of nature is \(\theta_3\)), the best action would be \(a_2\), unconditionally lease
the land, with a corresponding loss of -$45,000. Finally, if we knew that the well
was dry (state of nature is \(\theta_4\)), the best action again would be \(a_2\), also with a corresponding
loss of -$45,000. Because the (prior) probabilities of each of these states
are known, the expected loss with perfect information available about the state of
nature E(PI) is given by

\[
\begin{aligned}
E(PI) &= -650,000(0.1) - 200,000(0.15) - 45,000(0.25) - 45,000(0.50) \\
&= -\$128,750.
\end{aligned}
\]


Bayes’ solution (without any data) provided for an expected loss of -$51,250, which
is substantially more than the expected loss with perfect information, so that experimentation
can lead to potential savings. In fact, the decision maker should be willing
to pay a cost of up to -51,250 + 128,750 = $77,500 for perfect information.


Table 22.6 Bayes’ Actions, Expected Losses, and Marginal Distribution for Oil Example

x | Bayes’ Action | Expected Loss | Marginal Distribution \(Q_X(x)\)
1 | \(a_1\) | -115,700 | 0.351
2 | \(a_1\) | -48,275  | 0.259
3 | \(a_2\) | -33,000  | 0.215
4 | \(a_2\) | -33,000  | 0.175
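The perfect-information calculation can be reproduced with a few lines of Python: for each state take the smallest loss over the actions, weight by the prior, and compare with the best no-data expected loss. The loss values are those implied by the arithmetic in the text (the zero entries for the third action are an assumption), and the function names are illustrative.

```python
PRIOR = [0.10, 0.15, 0.25, 0.50]
LOSS = {
    "a1": [-650_000, -200_000, 25_000, 75_000],   # drill
    "a2": [-45_000, -45_000, -45_000, -45_000],   # unconditionally lease
    "a3": [-250_000, -100_000, 0, 0],             # conditionally lease (zeros assumed)
}


def expected_loss_with_perfect_information(loss, prior):
    """For each state pick the best action, then average over the prior."""
    return sum(p * min(row[k] for row in loss.values()) for k, p in enumerate(prior))


def bayes_loss_without_data(loss, prior):
    """Smallest prior expected loss over all actions."""
    return min(sum(l * p for l, p in zip(row, prior)) for row in loss.values())


e_pi = expected_loss_with_perfect_information(LOSS, PRIOR)
no_data = bayes_loss_without_data(LOSS, PRIOR)
value_of_perfect_information = no_data - e_pi
print(round(e_pi), round(no_data), round(value_of_perfect_information))
```

The difference is an upper bound on what any experiment, perfect or not, can be worth.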

Now that experimentation may be desirable, it is useful to determine the value
of obtaining seismic data. It has been shown that if the seismic data are classified as
(3), the optimal (Bayes) action is \(a_2\) with a corresponding loss of -$33,000. Using
the same techniques, the optimal (Bayes) actions can be obtained if the seismic data
are classified as (1), (2), or (4). These results are summarized in Table 22.6. Because
the entries for the expected loss depend upon the outcome of the experiment, x, the
overall measure of the effectiveness of the experiment requires obtaining a weighted
sum that is weighted with respect to the marginal probability distribution of the random
variable, X, of the Bayes’ losses. The calculations required for obtaining this marginal
distribution, \(Q_X(x)\), are described earlier in the section dealing with the calculation of
the posterior distribution, where we obtained \(Q_X(3)\). Other values are given in Table
22.6. Hence the weighted sum of the Bayes’ losses, called the unconditional expected
loss with experimentation, is given by

\[
-115,700(0.351) - 48,275(0.259) - 33,000(0.215) - 33,000(0.175) = -\$65,984.
\]

The value of the experiment (beyond its cost of $12,000) is then given by the difference
between this weighted sum and the Bayes’ loss without data; i.e.,

\[
-65,984 + 51,250 = -\$14,734,
\]

indicating an expected savings of $14,734 due to following an optimal decision procedure
with experimentation. Hence obtaining seismic soundings does reduce the total
expected cost.
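The entries of Table 22.6 and the value of the experiment can be recomputed with the sketch below. It uses the rounded posterior probabilities of Table 22.5c and the rounded marginal probabilities of Table 22.6, together with the loss values implied by the text (the zeros for the third action are an assumption); variable names are illustrative.

```python
LOSS = {
    "a1": [-650_000, -200_000, 25_000, 75_000],
    "a2": [-45_000, -45_000, -45_000, -45_000],
    "a3": [-250_000, -100_000, 0, 0],             # zeros assumed
}
POSTERIOR = {                                     # rows of Table 22.5c
    1: [0.166, 0.240, 0.327, 0.267],
    2: [0.129, 0.108, 0.241, 0.522],
    3: [0.039, 0.087, 0.146, 0.728],
    4: [0.0, 0.107, 0.238, 0.655],
}
MARGINAL = {1: 0.351, 2: 0.259, 3: 0.215, 4: 0.175}  # Q_X(x) from Table 22.6
SEISMIC_COST = 12_000
NO_DATA_BAYES_LOSS = -51_250

bayes_table = {}
for x, post in POSTERIOR.items():
    # Posterior expected loss of every action, including the cost of the soundings.
    losses = {
        a: sum(l * p for l, p in zip(row, post)) + SEISMIC_COST
        for a, row in LOSS.items()
    }
    best = min(losses, key=losses.get)
    bayes_table[x] = (best, losses[best])

# Weight each Bayes loss by the probability of observing that outcome.
unconditional = sum(loss * MARGINAL[x] for x, (_, loss) in bayes_table.items())
value_of_experiment = unconditional - NO_DATA_BAYES_LOSS

for x, (action, loss) in bayes_table.items():
    print(x, action, round(loss), MARGINAL[x])
print(round(unconditional), round(value_of_experiment))
```

A negative value here means the experiment lowers the expected loss even after its cost has been charged.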

In fact, if the nonoptimal procedure, \(d_1\), presented earlier in this section, is
followed, its weighted risk is given by

\[
-471,333(0.1) - 137,375(0.15) + 15,958(0.25) + 15,750(0.50) = -\$55,875.
\]

Thus an expected savings of $10,109 (= -65,984 + 55,875) is obtained by using
the (optimal) Bayes’ procedure rather than the procedure \(d_1\).


22.4 Decision Trees 


An alternative method to the analysis presented in this chapter is the use of decision
trees. A decision tree is a graphical method of expressing, in chronological order,
the alternative actions that are available to the decision maker and the choices determined
by chance. Decision trees consist of forks (nodes) and branches. There are two
types of forks: decision forks, represented by squares, □; and chance forks, represented
by circles, ○. Branches are straight lines that emanate from decision forks or
chance forks. When a decision maker encounters a decision fork, she must choose
one of the alternative branches to travel on. When a decision maker encounters a
chance fork, she has no control over which branch to travel on. Instead, her path is
determined by chance events whose probabilities are those associated with the
branches that emanate from the chance fork.

For example, the decision tree for the oil-drilling problem is given in Fig. 22.1, 
and you are urged to refer to this figure throughout the ensuing discussion. Initially 
the decision maker has a choice of not using seismic soundings or using seismic 
soundings. Either action triggers some consequences. If the decision not to use seismic 
soundings is made, the decision maker is led down the appropriate path, arriving at 
a fork (node) with branches marked: drill, unconditionally lease, and conditionally 
lease. She must choose one of these branches on which to continue. If she chooses 
to drill, she is led down the appropriate path, arriving at a fork with branches marked: 
500,000-barrel well, 200,000-barrel well, 50,000-barrel well, and dry well. The 
choice of the branch on which to continue is a chance event. 

Depending upon the outcome of this chance event, she reaches a terminating 
point. Similarly, if the initial decision is to use seismic soundings, the decision maker 
is led down the appropriate path, arriving at a fork with branches marked: definitely 
closed structure, probably closed structure, nonclosed structure, no structure. The 
choice of the branch on which to continue is a chance event. If by chance the data 
reveal a nonclosed structure, then this branch is chosen, and she arrives at a fork with 
branches marked: drill, unconditionally lease, and conditionally lease. The decision 
maker must choose one of these branches on which to continue. If she chooses to 
drill, she is led down the appropriate branch, arriving at a fork with branches marked: 
500,000-barrel well, 200,000-barrel well, 50,000-barrel well, and dry well. The 
choice of the branch on which to continue is again a chance event. Depending upon 
the outcome of this chance event, she reaches a terminating point. The entire tree can
be completed in this fashion.¹

The previous discussion presents a graphical method for representing the decision
problem. However, nothing was said about how to choose the optimal path to
travel on. Basically, the calculations described in the earlier sections of this chapter
are required, namely, finding posterior probabilities, marginal probabilities, and
Bayes’ risks. For each possible path that can be followed, the loss is specified at the
terminal point. Working backward from each terminal point to the nearest fork (a
chance fork), we place a loss at that fork, this cost being the expected cost taken with
respect to the probabilities associated with the branches. These probabilities represent
the probability of a certain state of nature, indicated by the terminal branch, being
chosen, given the path followed to the last fork. For example, for the no-seismic data,
drill path, the probability of the 500,000-barrel branch is just the prior probability that
the well is a 500,000-barrel well, that is, 0.1. For the seismic data, nonclosed structure,
drill path, the probability of the 500,000-barrel branch is just the posterior probability
that the well is a 500,000-barrel well, given that the seismic reading was
classified as a nonclosed structure, that is, 0.039. Again, working backward, we find
that the next fork is a decision fork. The loss associated with this fork is the minimum
loss over the branches associated with that node. On the no-seismic data path,
-51,250 represents the minimum of -51,250, -45,000, and -40,000 and is associated
with the action drill. Hence, this action is the best one to take, given that
the decision maker is at that fork. The symbol (X) through the other two branches
eliminates the actions unconditionally lease and conditionally lease from further consideration
on that path. Similarly, on the seismic data, nonclosed structure path,
-45,000 represents the minimum of 15,500, -45,000, and -18,450 and is associated
with the action unconditionally lease. Hence, this action is the best one to take,
given that the decision maker is at that fork. The symbol (X) through the other two
branches eliminates the other two actions from further consideration on that path. The
next fork on the seismic data path is a chance fork. The loss associated with this fork
is the expected cost taken with respect to the probabilities associated with the branches.
These probabilities represent the (unconditional) probability that the seismic reading,
indicated by the branch, is obtained, given the path followed to this fork. For the
seismic data path, the probability that the seismic reading will lead to the nonclosed
structure branch is just the unconditional probability that it will be classified as a
nonclosed structure, that is, 0.215. Finally, the beginning fork has a loss associated
with it of -65,984. This loss is the minimum of -51,250, which is associated with the
no-seismic data branch, and -65,984, which is associated with the seismic data branch
(and obtained by adding the cost of taking seismic soundings, 12,000, to the -77,984
attached to the branch). Note that the cost of taking seismic soundings is denoted by
the symbol A on the branch. Hence the no-seismic data branch is eliminated, and the
optimal procedure is to follow the seismic data path, leading to an expected profit of
$65,984, which is, of course, the solution obtained earlier. Again, it is worthwhile to
note that the calculations required by using decision tree analysis are identical to those
required by using the previously described analytical methods.


¹ Only part of the tree is presented in Fig. 22.1, but all other branches can be easily drawn.

Figure 22.1 Decision tree for oil-drilling example.
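The backward pass described above (expected value at chance forks, minimum at decision forks) can be sketched as a small recursive function in Python. The dictionary-based tree representation and all function names are illustrative choices, not the book's notation; the probabilities are the rounded values from Tables 22.5c and 22.6, and the loss values are those implied by the text (zeros for the third action assumed).

```python
PRIOR = [0.10, 0.15, 0.25, 0.50]
LOSS = {
    "drill": [-650_000, -200_000, 25_000, 75_000],
    "unconditionally lease": [-45_000, -45_000, -45_000, -45_000],
    "conditionally lease": [-250_000, -100_000, 0, 0],   # zeros assumed
}
POSTERIOR = {
    1: [0.166, 0.240, 0.327, 0.267],
    2: [0.129, 0.108, 0.241, 0.522],
    3: [0.039, 0.087, 0.146, 0.728],
    4: [0.0, 0.107, 0.238, 0.655],
}
MARGINAL = {1: 0.351, 2: 0.259, 3: 0.215, 4: 0.175}
SEISMIC_COST = 12_000


def terminal(loss):
    return {"type": "terminal", "loss": loss}


def chance(branches):
    """branches: list of (probability, child node)."""
    return {"type": "chance", "branches": branches}


def decision(branches):
    """branches: dict mapping a label to (cost charged on the branch, child node)."""
    return {"type": "decision", "branches": branches}


def rollback(node):
    """Work backward through the tree and return the loss attached to this fork."""
    if node["type"] == "terminal":
        return node["loss"]
    if node["type"] == "chance":
        return sum(p * rollback(child) for p, child in node["branches"])
    values = {label: cost + rollback(child) for label, (cost, child) in node["branches"].items()}
    node["best"] = min(values, key=values.get)  # remember which branch survives
    return values[node["best"]]


def action_fork(state_probs):
    """Decision fork over the three actions, each followed by a chance fork over states."""
    return decision({
        action: (0, chance([(p, terminal(l)) for p, l in zip(state_probs, row)]))
        for action, row in LOSS.items()
    })


no_seismic = action_fork(PRIOR)
seismic = chance([(MARGINAL[x], action_fork(POSTERIOR[x])) for x in (1, 2, 3, 4)])
tree = decision({"no seismic": (0, no_seismic), "seismic": (SEISMIC_COST, seismic)})

root_loss = rollback(tree)
print(tree["best"], round(root_loss), no_seismic["best"])
```

Charging the cost of the soundings on the branch, rather than at every terminal point, mirrors the way the figure attaches the cost to the seismic branch.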


22.5 Utility Function 


The oil-drilling example assumed that an expected loss (profit) in monetary terms was 
the appropriate measure of the consequences of taking an action, given a state of 
nature. However, there are many situations where this assumption is inappropriate. 
For example, suppose that an individual was offered the choice of (1) accepting a 
50-50 chance of winning $10,000 or nothing or (2) receiving $4,000 with certainty. 
Many people would prefer the $4,000 even though the expected payoff on the 50-50 
chance of winning $10,000 is $5,000. A company may be unwilling to invest a large
sum of money in a new product, even if the expected profit is substantial, if there is a
risk of losing its investment and thereby becoming bankrupt. People buy insurance
even though it is a poor investment; the insurance company must pay expenses and
make a profit. Do these examples invalidate the previous material? Fortunately, the
answer is no, because there is a way of transforming monetary values into an appropriate
scale that reflects the decision maker’s preferences. This scale is called the
utility scale, and it becomes the appropriate measure of the consequences of taking 
an action, given a state of nature. A detailed discussion of utilities can be found in 
the references at the end of this chapter. 
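A small Python sketch shows how a utility scale can reverse the ranking given by expected monetary value in the $10,000 gamble above. The square-root utility function is purely a hypothetical example of a risk-averse scale; the text does not specify any particular utility function.

```python
import math


def expected_utility(lottery, utility):
    """lottery: list of (probability, monetary payoff); utility: function of money."""
    return sum(p * utility(x) for p, x in lottery)


gamble = [(0.5, 10_000), (0.5, 0)]   # 50-50 chance of $10,000 or nothing
sure_thing = [(1.0, 4_000)]          # $4,000 with certainty


def money(x):
    """Identity utility: corresponds to ranking by expected monetary value."""
    return x


risk_averse = math.sqrt              # hypothetical concave utility function

print(expected_utility(gamble, money), expected_utility(sure_thing, money))
print(expected_utility(gamble, risk_averse), expected_utility(sure_thing, risk_averse))
```

Under expected monetary value the gamble ranks first, whereas under the concave utility the certain $4,000 ranks first, which matches the preference many people express.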


22.6 Carnival Example 


A carnival is scheduled to appear in a city on a given date. The profits that will be
obtained depend heavily upon the weather. In particular, if the weather is rainy, the
carnival loses $15,000; if it is cloudy, the carnival loses $5,000; and if it is sunny,
the carnival makes a profit of $10,000. The carnival has to set up equipment for its
show, but it can cancel the show prior to setting up its equipment. This action results
in a loss of $1,000. Furthermore, by incurring an additional cost of $1,000, the
carnival can postpone its setup decision until the day before the scheduled performance.
At this time, the carnival can obtain the local weather report. The Weather
Bureau has compiled data based upon its predictions; these data are given in Table
22.7. Furthermore, the Weather Bureau has compiled a prior distribution of the
weather. In particular, the probabilities of rain, clouds, and sun are 0.1, 0.3, and 0.6,
respectively.


Table 22.7 Weather Bureau Data for Carnival Example

Probability that       | Actual Weather
Forecast Is            | Rain | Clouds | Sun
Rain (RF)              | 0.7  | 0.2    | 0.1
Clouds (CF)            | 0.2  | 0.6    | 0.2
Sun (SF)               | 0.1  | 0.2    | 0.7

We shall analyze this example by first using decision tree analysis. The unevaluated
decision tree is shown in Fig. 22.2, and it is a graphical representation of
the decision problem.¹ Note that the lower part of the tree represents the no-data case,
while the upper part uses additional information from experimentation. The first decision
fork to confront the decision maker requires making a choice between using or
not using the Weather Bureau’s report (fork 1). If the choice is not to use the Weather
Bureau’s information, the decision maker is led down the path that arrives at decision 
fork 2 with branches marked setup and no setup (cancel). The selection of one of 
these branches results in a flow into a chance fork (either fork 3 or fork 4). The choice 
of the branch on which to continue is a chance event, and, depending upon the outcome 
of this chance event, the decision maker reaches one of the terminating points: rain, 
clouds, or sun. Associated with each of these terminating points is a monetary 
consequence. 

If, at initial decision fork 1, the decision maker chooses to use the Weather 
Bureau’s information, he is led down the path that arrives at fork 5, with branches 
marked: weather forecast rain, weather forecast cloudy, weather forecast sunny. The 
choice of the branch on which to continue is a chance event. If (by chance) the Weather 
Bureau forecasts a sunny day, then the decision maker chooses this branch, and it 
leads to decision fork 11, which has subsequent branches identical to those described 
for fork 2. 

This discussion has been concerned with a graphical method for representing
the decision problem, and it has not been concerned with how to choose the optimal
path to travel on. Determining the optimal path requires calculations similar to those
described for the oil-drilling example. In particular, the posterior distribution of the
states of nature (rain, clouds, or sun) is required, given the weather forecast. This
posterior distribution is given in Table 22.8c, and the details will be presented later in
this section. These posterior probabilities are necessary for the evaluation of the decision
tree.

¹ The forks are numbered arbitrarily for discussion purposes.

Figure 22.2 Unevaluated decision tree for carnival example.

Figure 22.3 is the evaluated decision tree for the carnival example. For each
possible path that can be followed, the loss is specified at the terminal point. If we
work backward from each terminal point to the nearest fork (a chance fork), the loss
placed at that fork is the expected cost taken with respect to the probabilities associated
with the branches. These probabilities represent the probability of the state of nature
(indicated by the terminal branch) being chosen, given the path followed to the last
fork. For example, for the Do not utilize Weather Bureau report, Setup path, the
probability of reaching the sun branch is just the prior probability that a day will be
sunny, that is, 0.6 (this probability is shown on the branch). The expected cost at
fork 3 is -$3,000 (profit). For the Utilize Weather Bureau report, Weather forecast
sunny, Setup path, the probability of reaching the sun branch is just the posterior
probability that a day will be sunny, given that the weather forecast is for a sunny
day, that is, 0.857 (this posterior probability is shown on the branch). The expected
cost at fork 12 is -$7,655 (profit). Working backward from fork 12 leads to fork
11, a decision fork. Fork 14 also leads into fork 11. The loss associated with fork 11
is the minimum loss over the two branches that emanate from that fork. The loss of
-$7,655 represents the minimum of -$7,655 and $1,000, and it is associated with
the action Setup. Hence this action is the best one to take, given that the decision
maker is at fork 11. The symbol (X) through the other branch eliminates that branch
from further consideration on that path. Again, working backward from fork 11 leads
to fork 5, a chance fork. The loss associated with this fork is the expected cost taken
with respect to the probabilities associated with the branches. These probabilities
represent the (unconditional) probability that the Weather Bureau’s forecast, indicated
by the branch, is obtained, given the path followed to this fork. For the Utilize Weather
Bureau report path, the probability that the weather forecast will be sunny is simply
the unconditional probability that the forecast will call for sun, that is, 0.49. This
probability appears on the appropriate branch in Fig. 22.3, and its calculation will be
discussed later in this section.¹ The expected cost at fork 5 is -$3,563 [-7,655(0.49)
- 5(0.32) + 1,000(0.19)]. Continuing to work backward from fork 5 leads to
fork 1, the initial decision fork. Fork 2 also leads into this initial decision fork. The
loss associated with fork 1 is the minimum loss over the (two) branches that emanate
from that fork. The loss of -$3,000 represents the minimum of -3,000 and -2,563
(after adding the $1,000 cost of using the Weather Bureau’s forecast), and it is associated
with the action that calls for not using the Weather Bureau’s forecast. Indeed,
the optimal decision is not to use the Weather Bureau’s forecast and to set up for the
carnival, leading to an expected profit of $3,000.

Figure 22.3 Evaluated decision tree for carnival example.

Although incurring a cost of $1,000 to use the Weather Bureau’s report was not
worthwhile, a cost of $563 or less would be worthwhile. Finally, what is the most
that the carnival should be willing to pay for any type of information about the
weather? The expected loss with perfect information is given by

\[
E(PI) = 1,000(0.1) + 1,000(0.3) - 10,000(0.6) = -\$5,600,
\]

so that some type of experimentation may lead to potential savings, with a decision
maker being willing to pay a cost for perfect information up to

\[
-3,000 + 5,600 = \$2,600.
\]


Calculation of the Posterior Distribution for the Carnival Example 


To calculate the posterior distribution, we need to use the same tabular algorithm that
was presented for the oil-drilling example. The results are given in Table 22.8.

The entries in the top row of Table 22.8a are simply the prior probabilities of
the weather conditions, whereas the entries in the bottom three rows are the conditional
probabilities of a forecast type, given a particular weather condition. The entries in
the first three columns of Table 22.8b are the entries of the corresponding elements
of Table 22.8a multiplied by the appropriate prior probability [e.g., 0.07 =
(0.70)(0.10)]. The entries in the last column of Table 22.8b are simply the sum of
the three elements in the corresponding row (e.g., 0.19 = 0.07 + 0.06 + 0.06).
The entries in Table 22.8c, the conditional probabilities of obtaining a particular
weather condition given a particular forecast type, are the entries of the corresponding
elements of Table 22.8b divided by the appropriate element, \(Q_X(x)\), in the last column
of Table 22.8b (e.g., 0.368 = 0.07/0.19).


¹ These probabilities are also obtained from the tabular algorithm for the computation of the posterior
distribution (the last column of Table 22.8b).

Table 22.8 Tabular Algorithm for Computation of Posterior Distribution for the Carnival Example
































Non-Decision Tree Analysis of the Carnival Example 


An alternative (but equivalent) method for solving the carnival example is to use the
techniques given in Secs. 22.2 and 22.3. There are two potential actions (other than
deciding whether or not to use the Weather Bureau’s data), namely, setup (\(a_1\)) and
no setup (\(a_2\)). There are three states of nature: rain (R), clouds (C), and sun (S). The
loss function is presented in Table 22.9. The problem will be solved by first assuming
the local Weather Bureau’s report is not available (no data). The expected loss \(l(a)\)
for each action is given by

\[
\begin{aligned}
l(a_1) &= 15,000(0.1) + 5,000(0.3) - 10,000(0.6) = -\$3,000, \\
l(a_2) &= 1,000(0.1) + 1,000(0.3) + 1,000(0.6) = \$1,000.
\end{aligned}
\]

Hence Bayes’ principle leads to selecting action \(a_1\), that is, setting up, and the associated
expected loss is -$3,000 (profit).

Now, the Weather Bureau’s report will be assumed to be available but at a cost
of $1,000. The posterior distribution can be obtained and is given in Table 22.8c.
This table was obtained by using the tabular algorithm. Entries can also be obtained
directly from the usual expressions for posterior probabilities. For example, the posterior
probability that the weather will be sunny, given that the forecast is for rain,
is given by

\[
\frac{(0.1)(0.6)}{(0.7)(0.1) + (0.2)(0.3) + (0.1)(0.6)} = 0.316.
\]


The optimal Bayes’ actions, given the various forecasts, are presented in Table
22.10. A typical entry in Table 22.10 will be obtained next. Given that the forecast
is for rain,

\[
\begin{aligned}
l_h(a_1) &= (0.368)(15,000) + (0.316)(5,000) - (0.316)(10,000) + 1,000 = \$4,940, \\
l_h(a_2) &= \$1,000 + 1,000 = \$2,000.
\end{aligned}
\]


Table 22.9 Loss Function for Carnival Example

                      | State of Nature
Action                | Rain (R) | Clouds (C) | Sun (S)
\(a_1\): setup        | 15,000   | 5,000      | -10,000
\(a_2\): no setup     | 1,000    | 1,000      | 1,000


Table 22.10 Table of Bayes’ Actions, Expected Losses, and
Marginal Distribution for Carnival Example

Forecast | Bayes’ Action | Expected Loss | Marginal Distribution
RF       | \(a_2\)       | 2,000         | 0.19
CF       | \(a_1\)       | 995           | 0.32
SF       | \(a_1\)       | -6,655        | 0.49


Hence the recommended Bayes’ action is \(a_2\), and the corresponding expected loss is
$2,000. Similarly, the marginal probability that the forecast will be sun is given by

\[
(0.1)(0.1) + (0.2)(0.3) + (0.7)(0.6) = 0.49.
\]

This result is also obtained in the last column of Table 22.8b. Hence, if the Weather
Bureau’s reports are to be used, the Bayes’ actions call for setting up if the forecast
is for clouds or sun but for not setting up if the forecast is for rain.

Is it desirable to use the Weather Bureau’s data? The answer to this question 
calls for computing the unconditional expected loss with experimentation (weighted 
sum of the Bayes’ losses); i.e., 


\[
2,000(0.19) + 995(0.32) - 6,655(0.49) = -2,563.
\]

Thus the value of the Weather Bureau’s data is given by

\[
-2,563 + 3,000 = 437
\]

(which is an actual loss relative to the no-data case), so that the Weather Bureau’s
data are not worth the $1,000 cost.
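The whole non-decision-tree analysis of the carnival example can be reproduced in a few lines of Python. The conditional forecast probabilities are those used in the posterior and marginal calculations above, and the variable names are illustrative. Because the code keeps unrounded posterior probabilities, it yields about -2,560 and 440 rather than the -2,563 and 437 obtained in the text from rounded values; the conclusion is unchanged.

```python
PRIOR = {"rain": 0.1, "clouds": 0.3, "sun": 0.6}
FORECAST_GIVEN_WEATHER = {           # P{forecast | actual weather}
    "RF": {"rain": 0.7, "clouds": 0.2, "sun": 0.1},
    "CF": {"rain": 0.2, "clouds": 0.6, "sun": 0.2},
    "SF": {"rain": 0.1, "clouds": 0.2, "sun": 0.7},
}
LOSS = {
    "a1": {"rain": 15_000, "clouds": 5_000, "sun": -10_000},  # setup
    "a2": {"rain": 1_000, "clouds": 1_000, "sun": 1_000},     # no setup
}
REPORT_COST = 1_000


def expected_loss(action, probs):
    """Expected loss of an action under a distribution over the weather."""
    return sum(LOSS[action][w] * p for w, p in probs.items())


# No-data case: minimize the prior expected loss.
no_data = min(expected_loss(a, PRIOR) for a in LOSS)

# With data: posterior, Bayes' action, and Bayes' loss for each forecast.
rows = {}
for forecast, cond in FORECAST_GIVEN_WEATHER.items():
    joint = {w: cond[w] * PRIOR[w] for w in PRIOR}
    marginal = sum(joint.values())
    posterior = {w: joint[w] / marginal for w in PRIOR}
    losses = {a: expected_loss(a, posterior) + REPORT_COST for a in LOSS}
    best = min(losses, key=losses.get)
    rows[forecast] = (best, losses[best], marginal)

with_data = sum(loss * q for _, loss, q in rows.values())
value_of_data = with_data - no_data   # positive means the report is not worth its cost

for forecast, (action, loss, q) in rows.items():
    print(forecast, action, round(loss), round(q, 2))
print(round(no_data), round(with_data), round(value_of_data))
```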


22.7 Conclusions 


Decision analysis has become an important technique in the solution of business 
problems. It can be applied to broad problems facing management, such as determin- 
ing whether to enter a new product field, or it can be used to solve smaller problems, 
such as the one illustrated by the oil-drilling example. It is characterized by the 
decision maker enumerating all the available courses of action, expressing the utilities, 
and quantifying the subjective probabilities. When these data are available, decision 
analysis becomes a powerful tool in determining an optimal course of action.


PROBLEMS


1.* A new type of airplane is to be purchased by the Air Force, and the number of 
spare engines to be ordered must be determined. The Air Force must order these spare engines 
in batches of five, and it can choose among only 15, 20, or 25 spares. The supplier of these 
engines has two plants, and the Air Force must make its decision prior to knowing which plant 
will be used. From past experience, the Air Force knows that the number of spare engines 
required when production takes place at Plant A is approximated by a Poisson distribution with 
parameter θ = 21, whereas the number of spare engines required when production takes place 
at Plant B is approximated by a Poisson distribution with parameter θ = 24. The cost of a 
spare engine purchased now is $400,000, whereas the cost of a spare engine purchased at a 
later date is $900,000. Holding costs and interest are to be neglected. Spares must always be 
supplied if they are demanded, and unused engines will be scrapped when the airplanes become 
obsolete. From these data, the loss function can be computed as 


                         State of Nature
Action            θ1: θ = 21        θ2: θ = 24
a1: order 15      1.155 × 10^7      1.414 × 10^7
a2: order 20      1.012 × 10^7      1.207 × 10^7
a3: order 25      1.047 × 10^7      1.135 × 10^7


The Air Force knows from past experience that two-thirds of all types of airplane engines 
are produced in Plant A, and only one-third are produced in Plant B. Furthermore, it knows 
that a similar type of engine was produced for an earlier version of the current airplane under 
consideration. The order size for this earlier type was the same as for the current model. 
Furthermore, its nonobsolete life is identical with that planned for the present version. The 
engine for the current order will be produced in the same plant as the previous model, although 
the Air Force does not know which of the two plants this is. The Air Force does have access 
to the data on the number of spares actually required for the older version (which had a Poisson 
distribution), but it does not have time to determine the production location. 

(a) What action does the Bayes’ procedure recommend, assuming that the information 
on the old airplane model is not available? 

(b) How much money is it worthwhile to pay for ‘‘perfect information’’? 

(c) Assuming that cost of data on the old airplane model is free and that 30 spares were 
required, determine the Bayes’ action. 
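
The entries of the loss table in Prob. 1 can be reproduced numerically. The sketch below assumes (our reading of the problem statement, not something stated explicitly) that the loss is the cost of the engines ordered now plus the later purchase cost of the expected shortfall under the Poisson demand. The function names and the truncation point kmax are ours.

```python
import math

def poisson_pmf(k, mean):
    # Computed in log space to avoid overflow for larger k.
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def expected_loss(order, mean, cost_now=400_000, cost_later=900_000, kmax=200):
    # Engines ordered now cost cost_now each; any shortfall is bought later.
    shortfall = sum((k - order) * poisson_pmf(k, mean)
                    for k in range(order + 1, kmax))
    return cost_now * order + cost_later * shortfall

for order in (15, 20, 25):
    row = [expected_loss(order, mean) for mean in (21, 24)]
    print(order, ["%.3e" % value for value in row])
```

Under that assumption the printed values should agree with the table to about three significant figures.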


2. A large mill is faced with the problem of extending $100,000 credit to a new cus- 
tomer, a dress manufacturer. The mill classifies typical companies into the following categories: 
poor risk, average risk, and good risk. Their experience indicates that 20 percent of similar 
companies are poor risks, 50 percent are average risks, and 30 percent are good risks. If credit 
is extended, the expected profit for poor risks is −$15,000, for average risks $10,000, and for 
good risks $20,000. If credit is not extended, the dress manufacturer will turn to another mill. 
The mill is able to consult a credit-rating organization for a fee of $2,000. Their experience 
with this credit-rating company is as follows: 





Credit Company          Actual Credit Rating, %
Evaluation              Average | Good
Poor                    40        20
Average                 50        40
Good                    10





(a) What action does Bayes’ procedure recommend, assuming the credit-rating company 
is not used? 

(b) How much money is it worthwhile to pay for ‘‘perfect information’’? 

(c) What is the optimal expected loss if the credit-rating company data are used? Does 
it pay to use these data? 

(d) What action does Bayes’ procedure recommend if the credit-rating company deter- 
mines the dress manufacturer to be a poor risk? 


3. Use the scenario given in Prob. 2. 

(a) Draw and properly label the decision tree. 

(b) Evaluate the decision tree. 

(c) Determine the optimal policy. 

(d) How much money is it worthwhile to pay for ‘‘perfect information’’? 


4. A manufacturer produces items that have a probability p of being defective. These 
items are formed into lots of 150. Past experience indicates that p is either 0.05 or 0.25, and, 
furthermore, in 80 percent of the lots produced, p equals 0.05 (and, in 20 percent of the lots, 
p equals 0.25). These items are then used in an assembly, and ultimately their quality is 
determined before the final assembly leaves the plant. Initially the manufacturer can either 
screen each item in a lot at a cost of $15 per item and replace defective items or use the items 
directly without screening. If the latter action is chosen, the cost of rework is ultimately 
$100/defective item. For these data, the costs per lot can be calculated as follows: 












                     p = 0.05     p = 0.25
Screen               $2,250       $2,250
Do not screen        $750         $3,750


Because screening requires scheduling of inspectors and equipment, the decision to screen or 
not screen must be made 2 days before the potential screening takes place. However, one item 
can be taken from the lot and sent to a laboratory, and its quality (defective or nondefective) 
can be reported before the screen/no-screen decision must be made. The cost of this initial 
inspection is $125. 
(a) What action does Bayes’ procedure recommend without looking at the single item? 
(b) How much money is it worthwhile to pay for ‘‘perfect information’’? 
(c) What is the optimal expected cost if the quality of items is determined before the 
screen/no-screen decision is made? 
(d) What action does Bayes’ procedure recommend if the quality of one of the items is 
determined and found to be defective? 


5. Use the scenario given in Prob. 4. 

(a) Draw and properly label the decision tree. 

(b) Evaluate the decision tree. 

(c) Determine the optimal policy. 

(d) How much money is it worthwhile to pay for ‘‘perfect information’’? 


6.* Assume that there are two weighted coins. Coin 1 has a probability of 0.3 of turning 
up heads, and coin 2 has a probability of 0.6 of turning up heads. A coin is tossed once. The 
decision maker must decide which coin was tossed. The probability that coin 1 was tossed is 
0.6, and the probability that coin 2 was tossed is 0.4. The loss matrix is as follows: 


                          Coin 1 Tossed | Coin 2 Tossed 
                          1 
a1: say coin 1 tossed 
a2: say coin 2 tossed 





(a) What is the Bayes’ procedure (action) before the coin is tossed? 
(b) What is the Bayes’ procedure if the outcome is heads? What if it is tails? 


7. Use the scenario given in Prob. 6. 

(a) Draw and properly label the decision tree. 
(b) Evaluate the decision tree. 

(c) Determine the optimal policy. 


8.* A company has developed a new chip that will enable it to enter the microcomputer 
field if it so desires. Alternatively, it can sell its rights for $800,000. If it chooses to build 
computers, the profitability of the venture depends upon the company’s ability to market the 
microcomputer during the first year. It has sufficient access to retail outlets so that it can 
guarantee sales of 1,000 computers. On the other hand, if this computer catches on, it can sell 
10,000 machines. The company believes that both sales alternatives are equally likely and that 
all other alternatives are negligible. The cost of setting up the assembly line is $600,000. The 
difference between the selling price and the variable cost is $600. Market research can be 
performed at a cost of $400,000 to determine which of the two levels of demand is more 
realistic. Previous experience indicates that such market research is correct two-thirds of the 
time. 

(a) What action does Bayes’ procedure recommend, assuming market research is not 
used? 

(b) How much money is it worthwhile to pay for ‘‘perfect information’’? 

(c) What is the optimal expected loss if market research is used? Does it pay to use 
market research? 

(d) What action does Bayes’ procedure recommend if market research determined that 
only 1,000 computers will be sold? 


9. Use the scenario given in Prob. 8. 

(a) Draw and properly label the decision tree. 

(b) Evaluate the decision tree. 

(c) Determine the optimal policy. 

(d) How much money is it worthwhile to pay for ‘‘perfect information’’? 


10. A new type of camera film has been developed. It is packaged in sets of five sheets, 
each sheet providing an instantaneous snapshot. Because this process is new, the manufacturer 
has attached an additional sheet to the package, so that the store may test one sheet before it 
sells the package of five. In promoting the film, the manufacturer offers to refund the entire 
purchase price of the film if one of the five is defective. This refund must be paid by the camera 
store, and the selling price has been fixed at $1 if this guarantee is to be valid. The camera 
store may sell the film for 50 cents if the preceding guarantee is replaced by one that pays 10 
cents for each defective sheet. The cost of the film to the camera store is 20 cents, and the film 
is not returnable. The store may take three actions: 


a1: scrap the film, 
a2: sell the film for $1, 
a3: sell the film for 50 cents. 


(a) If the six states of nature correspond to 0, 1, 2, 3, 4, 5 defective sheets in the 
package, complete the following loss table: 










(b) 

Quality of Attached Sheet 
Good 
Bad 
Total 








These data indicate that each state of nature is equally likely, so that this prior can 
be assumed. What is Bayes’ procedure (before testing the attached sheet) for a 
package of film? 

(c) What is the optimal expected loss for a package of film if the attached sheet is 
tested? What action does Bayes’ procedure recommend if the sheet is good? If it is 
bad? 


11. There are two biased coins with probabilities of landing heads 0.8 and 0.4, respectively. 
One of the coins is chosen at random (each with probability 1/2), and you are to receive 
$100 if you correctly predict how many heads will occur in two coin tosses. 

(a) Using Bayes’ criterion, what would you predict, and what is the corresponding 
expected gain? 

(b) Suppose now that you may observe a practice coin toss before predicting. Using 
Bayes’ criterion, what would you predict based on each of the possible outcomes 
of the practice toss? 

(c) What is your expected gain with the practice toss? 


12. A private university is considering whether or not to hold an extensive centennial 
campaign next fall to raise funds for a new athletic field. The response to the campaign depends 
heavily upon the success of this spring’s varsity baseball team. If the baseball team has a 
winning season (W), many of the alumni will contribute, and the campaign will raise $3 million. 
If the team has a losing season (L), few will contribute, and the campaign will lose $2 million. 
If no campaign is undertaken, no costs are incurred. 

(a) Based upon past performance, the baseball team has had winning seasons 60 percent 
of the time. How much should the president of the university be willing to pay for 
‘‘perfect information’’? 

(b) Now there is the possibility of hiring a professional talent scout to look at the baseball 
team before March 25, the date when both the baseball season begins and the 
decision of whether or not to hold the campaign must be made. For $100,000, this 
talent scout correctly predicts, three-fourths of the time, what kind of season, W or 
L, a team will have. Find the posterior distribution of the outcome of the baseball 
team’s season, given that the scout’s prediction is for (1) a winning season, W_p, and 
for (2) a losing season, L_p. 

(c) Evaluate the decision tree. Should the president pay $100,000 for the scout’s pre- 
diction? 


13. An emerging presidential candidate is considering whether or not to run in the 
high-stakes Super Tuesday primaries. If he enters the Super Tuesday (S.T.) primaries, he and his 
advisors believe that he will either do well (finish first or second) or do poorly (finish third or 
worse) with probabilities 0.4 and 0.6, respectively. Doing well on Super Tuesday will net the 
candidate’s campaign approximately $4 million, whereas a poor showing will mean a loss of 
$2.5 million after paying for numerous TV ads. Alternatively, he may choose not to run at all 
on Super Tuesday and incur no costs. 

The candidate’s advisors realize that his chances of success on Super Tuesday may be 
affected by the outcome of the smaller New Hampshire (N.H.) primary occurring three weeks 
before Super Tuesday. Political analysts feel that the results of New Hampshire’s primary are 
correct two-thirds of the time in predicting the results of the Super Tuesday primaries. Among 
the advisors is a decision analysis expert who uses this information to calculate the following 
probabilities: 


P{candidate does well in S.T. primaries | candidate does well in N.H.} = 4/7 


P{candidate does well in S.T. primaries | candidate does poorly in N.H.} = 1/4 


P{candidate does well in N.H. primary} = 7/15 

The cost of entering and campaigning in the New Hampshire primary is estimated to be 
$400,000. 
(a) Draw and properly label the decision tree. 
(b) Evaluate the decision tree. 
(c) Determine the strategy that minimizes the candidate’s expected losses. 
(d) How much money is it worthwhile to pay for ‘‘perfect information’’? 
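
The three probabilities quoted in Prob. 13 can be re-derived from Bayes' rule. The sketch below assumes that "correct two-thirds of the time" means the New Hampshire result agrees with the Super Tuesday result with probability 2/3 whichever way Super Tuesday turns out; that reading and the variable names are ours.

```python
from fractions import Fraction

prior_well = Fraction(2, 5)   # P{does well in S.T.} = 0.4
accuracy = Fraction(2, 3)     # P{N.H. result agrees with S.T. result}

# P{does well in N.H.} by the law of total probability.
p_nh_well = prior_well * accuracy + (1 - prior_well) * (1 - accuracy)

# Posterior probabilities of doing well in S.T. by Bayes' rule.
p_st_well_given_nh_well = prior_well * accuracy / p_nh_well
p_st_well_given_nh_poor = prior_well * (1 - accuracy) / (1 - p_nh_well)

print(p_nh_well, p_st_well_given_nh_well, p_st_well_given_nh_poor)
```

Exact fractions are used so that the results can be compared directly with the values given in the problem.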


14. A doctor has a seriously ill patient but has had trouble diagnosing the specific cause 
of the illness. The doctor now has narrowed the cause down to two alternatives: disease A or 
disease B. Based on the evidence so far, she feels that the two alternatives are equally likely. 

Beyond the testing already done, there is no test available to determine if the cause is 
disease B. One test is available for disease A, but it has two major problems. First, it is very 
expensive. Second, it is somewhat unreliable, giving an accurate result only 80 percent of the 
time. Thus, it will give a positive result (indicating disease A) for only 80 percent of patients 
who have disease A, whereas it will give a positive result for 20 percent of patients who actually 
have disease B instead. 

Disease B is a very serious disease with no known treatment. It is sometimes fatal, and 
those who survive remain in poor health with a poor quality of life thereafter. The prognosis 
is similar for victims of disease A if it is left untreated. However, there is a fairly expensive 
treatment available that eliminates the danger of death for those with disease A, and it may 
return them to good health. Unfortunately, it is a relatively radical treatment that always leads 
to death if the patient actually has disease B instead. 

The probability distribution for the prognosis for the current patient is given for each 
case in the following table, where the column headings (after the first one) indicate the disease 
for this patient. 





                               Outcome Probabilities 
                        No Treatment      Receive Treatment 
                                          for Disease A 
Outcome 
Survive with poor health 
Return to good health 


The patient has assigned the following utilities to the possible outcomes. 










Outcome                      Utility 
Die                          0 
Survive with poor health     10 
Return to good health        30 





In addition, these utilities should be incremented by −2 if the patient incurs the cost of the 
test for disease A and by −1 if the patient (or the patient’s estate) incurs the cost of the 
treatment for disease A. 

Use decision analysis with a complete decision tree to determine if the patient should 
undergo the test for disease A and then how to proceed (receive the treatment for disease A?) 
in order to maximize the patient’s expected utility. 


15. Solve the oil-drilling example if the profit per barrel of oil is increased to $4. 


16.* Solve the oil-drilling example by using the prior distribution formed from the data 
in Table 22.3. 


17. Solve the carnival example if the cost of using the Weather Bureau’s data is reduced 
to $400. 


18. Refer to the scenarios in the appropriate problems and assume that no experimental 
data are available; draw, properly label, and evaluate the decision tree: 

(a) For Prob. 2; 

(b) For Prob. 4; 

(c) For Prob. 6; 

(d) For Prob. 8. 


Simulation 


The technique of simulation has long been an important tool of the designer. For 
example, simulating airplane flight in a wind tunnel is standard practice when de- 
signing a new airplane. Theoretically, the laws of physics could be used to obtain the 
same information about how the performance of the airplane changes as design pa- 
rameters are altered, but, as a practical matter, the analysis would be too complicated 
to do it all. Another alternative would be to build real airplanes with alternative designs 
and test them in actual flight to choose the final design, but this would be far too 
expensive (as well as unsafe). Therefore, after performing some preliminary theoret- 
ical analysis to develop a rough design, simulating flight in a wind tunnel is a vital 
tool for experimenting with specific designs. This simulation amounts to imitating the 
performance of a real airplane in a controlled environment in order to estimate what 
its actual performance would be. After developing a detailed design in this way, a 
prototype model can then be built and tested in actual flight in order to fine-tune the 
final design. 

Simulation plays essentially this same role in many operations research studies. 
However, rather than designing an airplane, the operations research team is concerned 
with developing a design (or operating policy) for some stochastic system (a system 
that evolves probabilistically over time). Some of these stochastic systems resemble 
the examples of Markov chains and queueing systems described in Chaps. 15-17, 
and others are more complicated. Rather than using a wind tunnel, the performance 
of the real system is imitated by using probability distributions to randomly generate 
the various events that occur in the system. Therefore, a simulation model synthesizes 
the system by building it up component by component and event by event. It then 
runs the simulated system to obtain statistical observations of the performance of the 
system that result from the various randomly generated events. Because the simulation 
runs typically require generating and processing a vast amount of data, these simulated 
statistical experiments normally are performed on a computer. 

When simulation needs to be used as part of an operations research study, it 
commonly is preceded and followed by the same steps as for the airplane design. In 
particular, some preliminary theoretical analysis is done first to develop a rough design 
of the system. Then simulation is used to experiment with specific designs in order 
to estimate what the actual performance of the system would be. After developing a 
detailed design in this way, the system is tested in actual use in order to fine-tune the 
final design. 

Operations research teams typically use simulation when the stochastic system 
involved is too complex to be analyzed satisfactorily by the kinds of analytical models 
described in Chaps. 15-22. One of the main strengths of the analytical approach is 
that it abstracts the essence of the problem and reveals its underlying structure, thereby 
providing insight into the cause-and-effect relationships within the system. Therefore, 
if it is possible to construct an analytical model that is both a reasonable idealization 
of the problem and amenable to solution, this approach usually is superior to simu- 
lation. However, many problems are so complex that they cannot be solved analyti- 
cally. Thus, even though simulation tends to be a relatively expensive procedure, it 
often provides the only practical approach to a problem. 

In essence, then, the operations research view of simulation is that it is a con- 
trolled statistical sampling technique for estimating the performance of complex sto- 
chastic systems when analytical models do not suffice. Rather than describing the 
overall behavior of the system directly, the simulation model describes the operation 
of the system in terms of individual events of the individual components of the system. 
In particular, the system is divided into elements whose behavior can be predicted, 
at least in terms of probability distributions, for each of the various possible states of 
the system and its inputs. The interrelationships among the elements also are built into 
the model. 

Thus simulation provides a means of dividing the model-building job into 
smaller component parts that can be formulated more readily (e.g., a component part 
might be a simple queueing system) and then combining these component parts in 
their natural order. After constructing the model, we can then activate it by using 
random numbers to generate simulated events over time according to the appropriate 
probability distributions. The result is a simulation of the actual operation of the system 
over time, and we can record its aggregate behavior. By repeating this process for 
the various alternative configurations for the design and operating policies of the 
system, and by comparing their performance, we can identify the most promising 
configurations. Because of statistical error, it is impossible to guarantee that the 
configuration yielding the best simulated performance is indeed the optimal one, but it 
should be at least near optimal if the simulated experiment was designed properly. 

Thus simulation typically is nothing more or less than the technique of perform- 
ing sampling experiments on the model of the system. The experiments are done on 
the model rather than on the real system itself only because the latter would be too 
inconvenient, expensive, and time consuming. Otherwise, simulated experiments 
should be viewed as virtually indistinguishable from ordinary statistical experiments, 
so they also should be based upon sound statistical theory. 

This chapter focuses on discrete event simulations (as opposed to continuous 
simulations), i.e., simulations where changes in the state of the system occur at random 
points in time (as opposed to continuously) as a result of the occurrence of discrete 
events. The basic building blocks of a model for a discrete event simulation are the 
possible states and events, a simulation clock for recording the passage of (simulated) 
time, a mechanism for randomly generating the different kinds of events, and a mech- 
anism for then generating state transitions. 

These building blocks are described in Sec. 23.1 in the context of illustrative 
examples. The second section elaborates on the formulation and implementation of 
simulation models. The next two sections then focus on the design and analysis of 
the program of statistical experimentation inherent in a simulation study. 


23.1 Illustrative Examples 


This section uses some relatively simple stochastic systems to introduce and illustrate 
some basic concepts of simulation. The first system is so simple, in fact, that the 
simulation does not even need to be performed on a computer. The second system 
incorporates more of the normal features of a simulation, although it too is simple 
enough to be solved analytically. Following these two examples, we then survey some 
more typical applications of simulation. 


Example 1—A Coin-Flipping Game 


Suppose you were offered a chance to play a game whereby you would repeatedly 
flip an unbiased coin until the difference between the number of heads tossed and the 
number of tails tossed is three. You would be required to pay $1 for each flip of the 
coin, but you would receive $8 at the end of each play of the game. You are not 
allowed to quit during a play of the game. Thus you win money if the number of flips 
required is fewer than eight, but you lose money if more than eight flips are required. 
How would you decide whether or not to play this game? 

Many people would base this decision on simulation, although they probably 
would not call it by that name. (There is also an analytical solution for this game, but 
it is not a particularly elementary one.) In this case, simulation amounts to nothing 
more than playing the game alone many times until it becomes clear whether it is 
worthwhile playing for money. Half an hour spent in repeatedly flipping a coin and 
recording the earnings or losses that would have resulted might be sufficient. This is 
a true simulation because you are imitating the actual play of the game without actually 
winning or losing any money. 

How would this simulated experiment be executed on a computer? Although 
the computer cannot flip coins, it can generate numbers. Therefore, it would generate 
(or be given) a sequence of random digits, each of which would correspond to a flip 
of a coin. (The generation of random numbers is discussed in Sec. 23.2.) The 
probability distribution for the outcome of a flip is that the probability of a head is 1/2 and 
the probability of a tail is 1/2, whereas there are 10 possible values of a random digit, 
each having a probability of 1/10. Therefore, five of these values (say, 0, 1, 2, 3, 4) 
would be assigned an association with a head and the other five (say, 5, 6, 7, 8, 9) 
with a tail. Thus the computer would simulate the playing of the game by examining 
each new random digit generated and labeling it a head or a tail, according to its 
value. It would continue doing this, recording the outcome of each simulated play of 
the game, as long as desired. 
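
The digit-labeling procedure just described can be sketched in a few lines of Python. The function name play_games and the digit stream below are ours and purely illustrative; only the mapping (0 to 4 is a head, 5 to 9 is a tail) and the stopping rule come from the text.

```python
def play_games(digits, target=3):
    """Split a stream of random digits into plays of the coin game.

    Digits 0-4 count as heads and 5-9 as tails; a play ends when the
    absolute difference between heads and tails reaches target.
    Returns one H/T string per completed play.
    """
    plays, current, diff = [], [], 0
    for d in digits:
        if d <= 4:
            current.append("H")
            diff += 1
        else:
            current.append("T")
            diff -= 1
        if abs(diff) == target:
            plays.append("".join(current))
            current, diff = [], 0
    return plays

# Hypothetical digit stream, chosen only to exercise the function.
example_digits = [8, 1, 3, 7, 2, 7, 1, 6, 5, 5, 7, 0, 0, 3]
print(play_games(example_digits))
```

The length of each returned string is the number of flips required for that play, which is the observation recorded by the simulation.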

To illustrate the computer approach to this simulated experiment, we suppose 
that the computer generated the following sequence of random digits: 


, 3,7, 2,7, 1, 6,5, 5, 7, 9, 0, 0, 3, 4, 3, 5, 6, 8,5 

Thus, denoting a head by H and a tail by T, the first simulated play of the game is 
THHTHTHTTTT, requiring 11 simulated flips of a coin. The subsequent simulated 
plays of the game require 5, 5, 9, 7, 7, 5, 3, 17, 5, 5, 3, 9, and 7 simulated flips, 
respectively. This experiment has a sample size of 14 (14 simulated plays of the 
game), where the individual observations are the number of flips required for a play 
of the game. One useful statistic is 

\[ \text{Sample average} = \frac{11 + 5 + \cdots + 7}{14} = 7, \]





because the sample average provides an estimate of the true mean of the underlying 
probability distribution. 

This sample average of 7 would seem to indicate that, on the average, you 
should win about $1 each time you play the game. Therefore, if you do not have a 
relatively high aversion to risk, it appears that you should choose to play this game, 
preferably a large number of times. 

However, beware! One of the common errors in the use of simulation is that 
conclusions are based on overly small samples, because statistical analysis was in- 
adequate or totally lacking. In this case, the sample standard deviation is 3.67, so 
that the estimated standard deviation of the sample average is 3.67/√14 ≈ 0.98. 
Therefore, even if it is assumed that the probability distribution of the number of flips 
required for a play of the game is a normal distribution (which is a gross assumption 
because the true distribution is skewed), any reasonable confidence interval for the 
true mean of this distribution would extend far above 8. Hence a much larger sample 
size is required before we can draw a valid conclusion at a reasonable level of statistical 
significance. Unfortunately, because the standard deviation of a sample average is 
inversely proportional to the square root of the sample size, a large increase in the 
sample size is required to yield a relatively small increase in the precision of the 
estimate of the true mean. In this case, it appears that an additional 100 simulated 
plays of the game might be adequate. 
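
The statistics quoted above can be recomputed from the 14 observations. The sketch below uses Python's standard statistics module; the rough 95 percent interval relies on the same (gross) normality assumption mentioned in the text, with 2.16 as the t value for 13 degrees of freedom. The variable names are ours.

```python
import math
import statistics

# Number of flips required in each of the 14 simulated plays.
flips = [11, 5, 5, 9, 7, 7, 5, 3, 17, 5, 5, 3, 9, 7]

mean = statistics.mean(flips)         # sample average
s = statistics.stdev(flips)           # sample standard deviation (n - 1 divisor)
std_err = s / math.sqrt(len(flips))   # estimated std. dev. of the sample average

# Rough 95% interval for the true mean under the normality assumption.
half_width = 2.16 * std_err
print(mean, round(s, 2), round(std_err, 2),
      (round(mean - half_width, 2), round(mean + half_width, 2)))
```

The standard deviation comes out at about 3.68 (reported as 3.67 in the text), and the upper end of the interval lies above 8, so this sample cannot rule out a losing game.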

It so happens that the true mean of the number of flips required for a play of 
this game is 9. Thus, in the long run, you actually would lose about $1 each time 
you played the game. 
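
One way to verify the value 9 is to condition on the outcome of the next flip. This derivation is not given in the text, and the notation m_k is ours: let m_k denote the expected number of additional flips required when the absolute difference between heads and tails currently equals k. Because the coin is unbiased, the difference moves up or down by 1 with probability 1/2 each (from 0 it necessarily moves to 1), so that

```latex
\begin{aligned}
m_3 &= 0, \\
m_2 &= 1 + \tfrac{1}{2}\,(m_1 + m_3), \\
m_1 &= 1 + \tfrac{1}{2}\,(m_0 + m_2), \\
m_0 &= 1 + m_1 .
\end{aligned}
```

Substituting m_2 = 1 + m_1/2 into the third equation gives m_1 = 2 + (2/3) m_0, and the last equation then gives m_0 = 3 + (2/3) m_0, so m_0 = 9 (with m_1 = 8 and m_2 = 5).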

Although formally constructing a full-fledged simulation model is not really 
necessary for this simple simulation, we now shall do so for illustrative purposes. The 
stochastic system being simulated is the successive flipping of the coin for a play of 
the game. The simulation clock records the number of (simulated) flips t that have 
occurred so far. The information about the system that defines its current status, i.e., 
the state of the system, is 


N(t) = number of heads minus number of tails after t flips. 


The events that change the state of the system are the flipping of a head or the flipping 
of a tail. The event generation mechanism is the generation of a random digit where 


0-4 ⇒ a head, 
5-9 ⇒ a tail. 

The state transition mechanism is to set 


\[
N(t) = \begin{cases} N(t-1) + 1, & \text{if flip } t \text{ is a head} \\ N(t-1) - 1, & \text{if flip } t \text{ is a tail.} \end{cases}
\]


The simulated game then ends at the first value of t where N(t) = ±3, where the 
resulting sampling observation for the simulated experiment is (8 − t), the amount 
won (positive or negative) for that play of the game. 
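
These building blocks map directly onto a short program. The following is a minimal sketch in Python, not code from the text: the function name, the seed, and the number of plays are arbitrary choices of ours, and Python's random module stands in for whatever random digit source is actually used.

```python
import random

def simulate_play(rng):
    """Simulate one play of the game; return the amount won, 8 - t."""
    t = 0   # simulation clock: number of flips so far
    n = 0   # state N(t): number of heads minus number of tails
    while abs(n) != 3:
        digit = rng.randrange(10)      # event generation: one random digit
        t += 1
        n += 1 if digit <= 4 else -1   # state transition: 0-4 head, 5-9 tail
    return 8 - t

rng = random.Random(12345)             # arbitrary seed, for reproducibility only
winnings = [simulate_play(rng) for _ in range(20_000)]
print(sum(winnings) / len(winnings))
```

With a sample this large, the average amount won should settle close to −$1 per play, in line with a true mean of 9 flips.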

The next example will illustrate these building blocks of a simulation model for 
a more typical stochastic system. 


Example 2—An M/M/1 Queueing System 


Consider the M/M/1 queueing theory model (Poisson input, exponential service 
times, and single server) that was discussed at the beginning of Sec. 16.6. Although 
this model already has been solved analytically, it will be instructive to consider how 
to study it using simulation. To be specific, suppose that the values of the arrival rate 
λ and service rate μ are 


λ = 3 per hour, μ = 5 per hour. 


To summarize the physical operation of the system, arriving customers enter 
the queue, eventually are serviced by the server, and then leave. Thus it is necessary 
for the simulation model to describe and synchronize the arrival of customers and the 
servicing of customers. 

Starting at time 0, the simulation clock records the amount of (simulated) time 
t that has transpired so far during the simulation run. The information about the 
queueing system that defines its current status, i.e., the state of the system, is 


N(t) = number of customers in the system at time t. 


The events that change the state of the system are the arrival of a customer or 
a service completion for the customer currently in service (if any). We shall describe 
the event generation mechanism a little later. The state transition mechanism is to 


\[
\text{Reset } N(t) = \begin{cases} N(t) + 1, & \text{if an arrival occurs at time } t \\ N(t) - 1, & \text{if a service completion occurs at time } t. \end{cases}
\]


There are two basic methods used for advancing the simulation clock and re- 
cording the operation of the system. We did not distinguish between these methods 
for Example 1 because they actually coincide for that simple situation. However, we 
now shall describe and illustrate these two time advance mechanisms (fixed-time 
incrementing and next-event incrementing) in turn. 

With the fixed-time incrementing time advance mechanism, the following two- 
step procedure is used repeatedly. 


Summary of Fixed-Time Incrementing 


1. Advance time by a small fixed amount. 
2. Update the system by determining what events occurred during the elapsed 
time interval and what the resulting state of the system is. Also record desired 
information about the performance of the system. 


For the queueing theory model under consideration, only two types of events 
can occur during each of these elapsed time intervals, namely, one or more arrivals 
and one or more service completions. Furthermore, the probability of two or more 
arrivals or of two or more service completions during an interval is negligible for this 
model if the interval is relatively short. Thus the only two possible events during such 
an interval that need to be investigated are the arrival of one customer and the service 
completion for one customer. Each of these events has a known probability. 

To illustrate, let us use 0.1 hour (6 minutes) as the small fixed amount by which 
the clock is advanced each time. (Normally, a considerably smaller time interval would 
be used to make negligible the probability of multiple arrivals or multiple service 
completions, but this choice will create more action for illustrative purposes.) Because 
both interarrival times and service times have an exponential distribution, the 
probability P_A that a time interval of 0.1 hour will include an arrival is 


\[ P_A = 1 - e^{-3/10} = 0.259, \]


and the probability P_D that it will include a departure (service completion), given that 
a customer was being served at the beginning of the interval, is 


\[ P_D = 1 - e^{-5/10} = 0.393. \]


To randomly generate either kind of event according to these probabilities, the 
approach is similar to that in Example 1. The computer again is used to generate a 
random number, but this time with multiple digits rather than one. Placing a decimal 
point in front of the number then makes it a uniform random number on (0, 1), that 
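is, a random observation from the uniform distribution between 0 and 1. Denoting
this uniform random number by $r_A$,


\[
\begin{aligned}
r_A < 0.259 &\;\Rightarrow\; \text{an arrival occurred,} \\
r_A \ge 0.259 &\;\Rightarrow\; \text{an arrival did not occur.}
\end{aligned}
\]


Similarly, with another uniform random number $r_D$,


\[
\begin{aligned}
r_D < 0.393 &\;\Rightarrow\; \text{a departure occurred,} \\
r_D \ge 0.393 &\;\Rightarrow\; \text{a departure did not occur,}
\end{aligned}
\]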


given that a customer was being served at the beginning of the time interval. With 
no customer in service then (i.e., no customers in the system), it is assumed that no 
departure can occur during the interval even if an arrival does occur. 
Table 23.1 shows the result of using this approach for 10 iterations of the
fixed-time incrementing procedure, starting with no customers in the system and using
time units of minutes. 

Step 2 of the procedure (updating the system) includes recording the desired 
measures of performance about the aggregate behavior of the system during this time 
interval. For example, it could record the number of customers in the queueing system 
and the waiting time of any customer who just completed his or her wait. If it is 
sufficient to estimate only the mean rather than the probability distribution of each of 
these random variables, the computer would merely add the value (if any) at the end 
of the current time interval to a cumulative sum. The sample averages would be 
obtained after the simulation run was completed by dividing these sums by the sample 
sizes involved, namely, the total number of time intervals and the total number of 
customers, respectively. 
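
The two-step procedure can be sketched in Python as follows. This is only an illustrative sketch, not a definitive implementation: it assumes the rates implied by the probabilities computed above (3 arrivals per hour and 5 service completions per hour), and the function name `fixed_time_simulation` and its parameters are hypothetical.

```python
import math
import random


def fixed_time_simulation(n_steps, dt=0.1, arrival_rate=3.0, service_rate=5.0, seed=None):
    """Simulate a single-server queue by fixed-time incrementing.

    Returns the list of N(t) values after each interval and the sample
    average number of customers in the system.
    """
    rng = random.Random(seed)
    p_arrival = 1.0 - math.exp(-arrival_rate * dt)    # P_A
    p_departure = 1.0 - math.exp(-service_rate * dt)  # P_D

    n = 0          # N(t), the number of customers in the system
    history = []
    total = 0      # cumulative sum used for the sample average
    for _ in range(n_steps):
        # Step 1: advance time by dt (one pass through the loop).
        busy_at_start = n > 0
        # Step 2: decide which events occurred during the interval.
        arrival = rng.random() < p_arrival
        # A departure is possible only if a customer was in service
        # at the beginning of the interval.
        departure = busy_at_start and rng.random() < p_departure
        n += int(arrival) - int(departure)
        history.append(n)
        total += n
    return history, total / n_steps


history, average = fixed_time_simulation(10, seed=1)
print(history, average)
```

Only the mean of N(t) is accumulated here; estimating its probability distribution would require recording a count for each observed value instead of a single sum.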

Next-event incrementing differs from fixed-time incrementing in that the sim- 
ulation clock is incremented by a variable amount rather than by a fixed amount each 
time. This variable amount is the time from the event that has just occurred until the 
next event of any kind occurs; i.e., the clock jumps from event to event. A summary 
follows. 


Summary of Next-Event Incrementing 


1. Advance time to the time of the next event of any kind. 

2. Update the system by determining its new state that results from this event 
and by randomly generating the time until the next occurrence of any event 
type that can occur from this state (if not previously generated). Also record 
desired information about the performance of the system. 


For this example the computer needs to keep track of two future events, namely,
the next arrival and the next service completion (if a customer currently is being
served). These times are obtained by taking a random observation from the probability
distribution of interarrival and service times, respectively. As before, the computer
takes such a random observation by generating and using a random number. (This
technique will be discussed in Sec. 23.2.) Thus, each time an arrival or service
completion occurs, the computer determines how long it will be until the next time
this event will occur, adds this time to the current clock time, and then stores this
sum in a computer file. (If the service completion leaves no customers in the system,
then the generation of the time until the next service completion is postponed until
the next arrival occurs.) To determine which event will occur next, the computer finds
the minimum of the clock times stored in the file. To expedite the bookkeeping
involved, simulation programming languages provide a "timing routine" that
determines the occurrence time and type of the next event, advances time, and transfers
control to the appropriate subprogram for the event type.


Table 23.1 Applying Fixed-Time Incrementing to Example 2

t (min)   N(t)   r_A     Arrival in Interval?   Departure in Interval?
   0       0
   6       1     0.096   Yes
  12       1     0.569   No                     No
  18       1     0.764   No                     No
  24       0     0.492   No                     Yes
  30       0     0.950   No
  36       0     0.610   No
  42       1     0.145   Yes
  48       1     0.484   No                     No
  54       1     0.350   No                     No
  60       0     0.430   No                     Yes

Table 23.2 shows the result of applying this approach through five iterations of 
the next-event incrementing procedure, starting with no customers in the system and 
using time units of minutes. For later reference, we include the uniform random
numbers $r_A$ and $r_D$ used to generate the interarrival times and service times,
respectively, by the method to be described in Sec. 23.2. These $r_A$ and $r_D$ are the same as
used in Table 23.1 in order to provide a truer comparison between the two time
advance mechanisms. 

The next-event incrementing procedure is considerably better suited for this 
example and similar stochastic systems than the fixed-time incrementing procedure. 
Next-event incrementing requires fewer iterations to cover the same amount of simu- 
lated time, and it generates a precise schedule for the evolution of the system rather 
than a rough approximation. 
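
A minimal Python sketch of next-event incrementing for the same single-server queue is given here for comparison. It is a sketch under stated assumptions rather than a definitive implementation: the rates (3 arrivals and 5 service completions per hour) are those implied by Example 2, time is measured in hours, exponential times are generated by the inverse transformation method, and the function name `next_event_simulation` is hypothetical.

```python
import math
import random


def next_event_simulation(n_events, arrival_rate=3.0, service_rate=5.0, seed=None):
    """Return a list of (clock, event_type, N) records, one per event."""
    rng = random.Random(seed)

    def exponential(rate):
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        return -math.log(1.0 - rng.random()) / rate

    clock = 0.0
    n = 0                                   # customers in the system
    next_arrival = exponential(arrival_rate)
    next_departure = math.inf               # nobody is in service yet
    log = []
    for _ in range(n_events):
        # Step 1: advance the clock to the earliest scheduled event.
        if next_arrival <= next_departure:
            clock = next_arrival
            n += 1
            event = "arrival"
            # Step 2: schedule the next occurrence of this event type.
            next_arrival = clock + exponential(arrival_rate)
            if n == 1:                      # server was idle, so service starts now
                next_departure = clock + exponential(service_rate)
        else:
            clock = next_departure
            n -= 1
            event = "departure"
            # With no customers left, the next service completion is not
            # generated until the next arrival occurs.
            next_departure = clock + exponential(service_rate) if n > 0 else math.inf
        log.append((clock, event, n))
    return log


for record in next_event_simulation(5, seed=1):
    print(record)
```

Each pass through the loop processes exactly one event, so no iterations are spent on intervals in which nothing happens, and the recorded event times are exact rather than rounded to a fixed grid.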

The next-event incrementing procedure will be illustrated again in Sec. 23.4 
(see Table 23.11) in the context of a full statistical experiment for estimating certain 
measures of performance for another queueing system. 

Several pertinent questions about how to conduct a simulation study of this type 
still remain to be answered. These answers are presented in a broader context in 
subsequent sections. 


Typical Applications 


During the early 1980s, a survey was made of many leading American firms to learn 
more about their use of simulation, as reported by David Christie and Hugh Watson 
in Selected Reference 4. One major finding was the identification of the functional 
areas of the company where simulation was being applied. The results are shown in 
Table 23.3, where production (manufacturing) leads the list, closely followed by 
corporate planning, engineering, finance, and research and development. As the per- 
centages indicate, many of the companies are applying simulation in most of these 
areas. 

This survey also found that the development of simulation models had spread 
far beyond centralized operations research (or management science) departments. In 
fact, in 54 percent of the companies, simulation models were being created in
functional area departments, and corporate planning departments were doing so in 30
percent of the companies. The clear majority of both kinds of departments, as well
as OR departments, reported that they perceived the results of their simulation
applications to be "good," as opposed to "fair" or "poor."


Table 23.2 Applying Next-Event Incrementing to Example 2


Table 23.3 Percentage of Surveyed Companies Using Simulation in Certain Functional Areas

Functional Area              Percentage
Production                   59%
Corporate planning           53%
Engineering                  46%
Finance                      41%
Research and development     37%
Marketing                    24%
Data processing              16%
Personnel                    10%

More recent reports (e.g., see Selected Reference 9) indicate that the use of 
simulation continues to grow rapidly in the production (manufacturing) area, as well 
as in the corporate planning and finance areas. Growth in the latter areas has been 
aided by the recent development of specialized simulation programming languages 
for financial planning. 

Another important recent development has been the increasing use of computer 
graphics to generate animated displays of the movement of entities through the simu- 
lated system. For example, in the simulation of manufacturing systems, the moving 
entities can represent components being manufactured and various materials-handling 
devices such as automated guided vehicles. The computer graphics provide greater 
insight into the performance of the system for any given design, and they also add 
credibility to the results of the simulation study. 

There have been numerous applications of simulation in a wide variety of con- 
texts. Some examples are listed here to illustrate the great versatility of this technique: 


1. Simulation of the operations at a large airport by an airline company to test 
changes in company policies and practices (e.g., amounts of maintenance 
capacity, berthing facilities, spare aircraft, and so on). 

2. Simulation of the passage of traffic across a junction with time-sequenced 
traffic lights to determine the best time sequences. 

3. Simulation of a maintenance operation to determine the optimal size of 
repair crews. 

4. Simulation of the flux of uncharged particles through a radiation shield to 
determine the intensity of the radiation that penetrates the shield. 

5. Simulation of steel-making operations to evaluate changes in operating 
practices and the capacity and configuration of the facilities. 

6. Simulation of the U.S. economy to predict the effect of economic policy 
decisions. 

7. Simulation of large-scale military battles to evaluate defensive and offensive 
weapons systems. 

8. Simulation of large-scale distribution and inventory control systems to
improve the design of these systems.

9. Simulation of the overall operation of an entire business firm to evaluate
broad changes in the policies and operation of the firm and also to provide
a business game for training executives.

10. Simulation of a telephone communications system to determine the capacity 
of the respective components required to provide satisfactory service at the 
most economic level. 

11. Simulation of the operation of a developed river basin to determine the best 
configuration of dams, power plants, and irrigation works to provide the 
desired level of flood control and water-resource development. 

12. Simulation of the operation of a production line to determine the amount 
of in-process storage space that should be provided. 


23.2 Formulating and Implementing a Simulation Model 


Constructing the Model 


The first step in a simulation study is to develop a model representing the system to 
be investigated. This step requires the analyst to become thoroughly familiar with the 
operating realities of the system and the objectives of the study. Given this require- 
ment, the analyst probably would attempt to reduce the real system to a logical flow 
diagram. The system is thereby broken down into a set of components linked together 
by a master flow diagram, where the components themselves may be broken down 
into subcomponents, and so on. Ultimately the system is decomposed into a set of 
elements for which operating rules may be given. These operating rules predict the 
events that will be generated by the corresponding elements, perhaps in terms of 
probability distributions. After specifying these elements, rules, and logical linkages, 
the analyst needs to test the model thoroughly piece by piece. This testing can be 
done partially by performing a gross version of the simulation on a calculator and 
checking whether each input is received from the appropriate source and whether each 
output is acceptable to the next submodel. However, the individual components of 
the model also should be tested alone to verify that their internal performance is 
reasonably consistent with reality. 

It should be emphasized that, like any operations research model, the simulation 
model need not be a completely realistic representation of the real system. In fact, it 
appears that most simulation models err on the side of being overly realistic rather 
than overly idealized. With the former approach, the model easily degenerates into a 
mass of trivia and meandering details, so that a great deal of programming and com- 
puter time is required to obtain a small amount of information. Furthermore, failing 
to strip away trivial factors to get down to the core of the system may obscure the 
significance of those results that are obtained. 

If the behavior of an element cannot be predicted exactly, given the state of the 
system, it is better to take random observations from the probability distributions 
involved than to use averages to simulate the performance of this element. This 
statement is true even when one is interested only in the average aggregate per- 
formance of the system, because combining average performances for the individual 
elements may result in something far from average for the overall system. 

One question that may arise when choosing probability distributions for the
model is whether to use frequency distributions of historical data or to seek the
theoretical probability distribution that best fits these data. The latter alternative usually
is preferable because it avoids reproducing the idiosyncrasies of a certain period in
the past. 

When constructing the building blocks of a simulation model described and 
illustrated in the preceding section, one key step is the definition of the state of the 
system. The state must include the relevant information about the current status of the 
system so that generating the simulated evolution of the system based upon the state 
provides an accurate representation of the behavior of the real system. The state must
also allow measuring and combining quantities that yield meaningful estimates for 
measures of performance of the system. Frequently, the state must be represented by 
a vector (a set of state variables) rather than a single variable. For complex stochastic 
systems, there sometimes are alternative reasonable definitions for the state of the 
system. 


Generating Random Numbers 


As the examples in Sec. 23.1 demonstrated, implementing a simulation model requires 
random numbers to obtain random observations from probability distributions. One 
method for generating such random numbers is to use a physical device such as a 
spinning disk or an electronic randomizer. Several tables of random numbers have 
been generated in this way, including one containing 1 million random digits,
published by the Rand Corporation. An excerpt from the Rand table is given in Table
23.4. 

Table 23.4 Table of Random Digits


09656 96657 64842 49222 49506 10145 48455 23505 90430 04180
24712 55799 60857 73479 33581 17360 30406 05842 72044 90764
07202 96341 23699 76171 79126 04512 15426 15980 88898 06358
84575 46820 54083 43918 46989 05379 70682 43081 66171 38942
38144 87037 46626 70529 27918 34191 98668 33482 43998 75733

48048 56349 01986 29814 69800 91609 65374 22928 09704 59343
41936 58566 31276 19952 01352 18834 99596 09302 20087 19063
73391 94006 03822 81845 76158 41352 40596 14325 27020 17546
57580 08954 73554 28698 29022 11568 35668 59906 39557 27217
92646 41113 91411 56215 69302 86419 61224 41936 56939 27816

07118 12707 35622 81485 73354 49800 60805 05648 28898 60933
57842 57831 24130 75408 83784 64307 91620 40810 06539 70387
65078 44981 81009 33697 98324 46928 34198 96032 98426 77488
04294 96120 67629 55265 26248 40602 25566 12520 89785 93932
48381 06807 43775 09708 73199 53406 02910 83292 59249 18597

00459 62045 19249 67095 22752 24636 16965 91836 00582 46721
38824 81681 33323 64086 55970 04849 24819 20749 51711 86173
91465 22232 02907 01050 07121 53536 71070 26916 47620 01619
50874 00807 77751 73952 03073 69063 16894 85570 81746 07568
26644 75871 15618 50310 72610 66205 82640 86205 73453 90232

Source: Reproduced with permission from The Rand Corporation: A Million Random Digits with
100,000 Normal Deviates. Copyright, The Free Press, Glencoe, Ill., 1955, top of p. 182.


When performing a simulation on a computer, the needed random numbers
normally are generated directly by the computer by using a "random number
generator." A random number generator is an algorithm that produces sequences of
numbers that follow a specified probability distribution and possess the appearance of
randomness. The reference to "sequence of numbers" means that the algorithm
produces many random numbers in a serial manner. Although an individual user may
need only relatively few of the numbers, it is generally required that the algorithm be
capable of producing many numbers. "Probability distribution" implies that a
probability statement can be associated with the occurrence of each number produced by
the algorithm. The probability distribution is usually taken to be the uniform
distribution between 0 and 1, in which case the numbers generated by the algorithm may
be called uniform random numbers or simply random numbers.

Strictly speaking, the numbers generated by the computer should not be called 
random numbers because they are predictable and reproducible (which sometimes is 
advantageous), given the random number generator being used. Therefore, they are 
sometimes given the name pseudo-random numbers. However, the important point 
is that they satisfactorily play the role of random numbers in the simulation if the 
method used to generate them is valid. 

Various relatively sophisticated statistical procedures have been proposed for 
testing whether a generated sequence of numbers has an acceptable appearance of 
randomness.¹ Basically the requirements are that each successive number in the
sequence must have an equal probability of taking on any one of the possible values,
and it must be statistically independent of the other numbers in the sequence. In other 
words, the numbers need to be random observations from a uniform distribution. 

There are a number of random number generators available, of which the most 
popular are the congruential methods (additive, multiplicative, and mixed). The mixed 
congruential method includes features of the other two, so we shall discuss it first. 

The mixed congruential method generates a sequence of random numbers by
always calculating the next random number from the last one obtained, given an initial
random number $x_0$ (called the seed), which may be obtained from some published
source such as the Rand table. In particular, it calculates the (n + 1)st random number,
$x_{n+1}$, from the nth random number, $x_n$, by using the recurrence relation


\[ x_{n+1} = (a x_n + c) \ (\text{modulo } m), \]


where a, c, and m are positive integers (a < m, c < m). This mathematical notation
signifies that $x_{n+1}$ is the remainder when $(a x_n + c)$ is divided by m. Thus the possible
values of $x_{n+1}$ are 0, 1, ..., m - 1, so that m represents the desired number of
different values that could be generated for the random numbers.

To illustrate, suppose that m = 8, a = 5, c = 7, and $x_0$ = 4. The resulting
sequence of random numbers is calculated in Table 23.5. (The sequence cannot be
continued further because it would just begin repeating the numbers in the same order.)
Note that this sequence includes each of the eight possible numbers exactly once. This
property is desirable, but it does not occur with some choices of a and c. (Try a = 4,
c = 7, $x_0$ = 3.) Fortunately, there are rules available for choosing values of a
and c that will guarantee this property. (There are no restrictions on the seed, $x_0$,
because it affects only where the sequence begins and not the progression of numbers.)
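
The recurrence relation is short enough to check directly. The following Python sketch (the function name `mixed_congruential` is hypothetical) generates the sequence for m = 8, a = 5, c = 7 with seed 4, and then the degenerate choice a = 4, c = 7 with seed 3 suggested in the text.

```python
def mixed_congruential(seed, a, c, m, count):
    """Return [x_1, ..., x_count] from x_{n+1} = (a * x_n + c) mod m."""
    values = []
    x = seed
    for _ in range(count):
        x = (a * x + c) % m
        values.append(x)
    return values


print(mixed_congruential(seed=4, a=5, c=7, m=8, count=8))
# [3, 6, 5, 0, 7, 2, 1, 4]
print(mixed_congruential(seed=3, a=4, c=7, m=8, count=4))
# [3, 3, 3, 3]
```

The first call visits all eight possible values before returning to the seed, whereas the second never leaves the value 3.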

Table 23.5 Illustration of Mixed Congruential Method


1 See Selected References 6 and 14 for further information about these tests and the generation of random
numbers.


For a binary computer with a word size of b bits, the usual choice for m is
$m = 2^b$; this is the total number of nonnegative integers that can be expressed within
the capacity of the word size. (Any undesired integers that arise in the sequence of
random numbers are just not used.) With this choice of m, we can ensure that each
possible number occurs exactly once before any number is repeated by selecting any
of the values a = 1, 5, 9, 13, ... and c = 1, 3, 5, 7, .... For a decimal computer
with a word size of d digits, the usual choice for m is $m = 10^d$, and the same property
is ensured by selecting any of the values a = 1, 21, 41, 61, ... and c =
1, 3, 7, 9, 11, 13, 17, 19, ... (that is, all positive odd integers except those ending
with the digit 5). The specific selection can be made on the basis of the serial
correlation between successively generated numbers, which differs considerably among
these alternatives.¹

Occasionally, random numbers with only a relatively small number of digits are
desired. For example, suppose that only three digits are desired, so that the possible
values can be expressed as 000, 001, ..., 999. In such a case, the usual procedure
still is to use $m = 2^b$ or $m = 10^d$, so that an extremely large number of random
numbers can be generated before the sequence starts repeating itself. However, except
for purposes of calculating the next random number, all but three digits of each random
number would be discarded. One convention is to take the last three digits (i.e., the
three trailing digits).

The multiplicative congruential method is just the special case of the mixed 
congruential method where c = 0. The additive congruential method also is similar, 
but it sets a = 1 and replaces c by some random number preceding x, in the sequence, 
for example, x,_, (so that more than one seed is required to start calculating the 
sequence). 

Among the possible random number generators (choices of a and m) based on
the multiplicative congruential method, perhaps the most widely used is the
Learmonth-Lewis generator,


\[ x_{n+1} = 7^5 x_n \ (\text{modulo } 2^{31} - 1). \]


This generator has been tested extensively, and the results of the statistical tests
indicate that it is very satisfactory. Versions of this generator are used, e.g., in IBM
versions of APL, in the International Mathematics and Statistics Library (IMSL)
package, and in the random number generator package LLRANDOM. Tables of suitable
seeds also are available.
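
As a brief illustration, a multiplicative congruential generator with multiplier 7^5 = 16807 and modulus 2^31 - 1 can be sketched in Python as follows. The function name is hypothetical, and dividing by the modulus to obtain a uniform random number is an assumption of this sketch rather than a description of any particular package.

```python
def multiplicative_congruential(seed, count, a=7**5, m=2**31 - 1):
    """Return [x_1, ..., x_count] from x_{n+1} = a * x_n mod m.

    The seed should be an integer between 1 and m - 1; because m is
    prime, the sequence then never reaches 0.
    """
    values = []
    x = seed
    for _ in range(count):
        x = (a * x) % m
        values.append(x)
    return values


integers = multiplicative_congruential(seed=1, count=2)
print(integers)                                   # [16807, 282475249]
uniforms = [x / (2**31 - 1) for x in integers]    # values strictly between 0 and 1
print(uniforms)
```

Python integers do not overflow, so the product a * x_n can be formed directly; on a machine with a fixed word size the same recurrence requires more care.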


1 See Coveyou, R. R.: "Serial Correlation in the Generation of Pseudo-Random Numbers," Journal of
the Association of Computing Machinery, 7:72-74, 1960.


Generating Random Observations from a Probability Distribution 


Given a sequence of random numbers, how can one generate a sequence of random 
observations from a given probability distribution? 

For simple discrete distributions, one answer is quite evident, as demonstrated
by Example 1 in Sec. 23.1. Merely allocate the possible values of a random number
to the various numbers in the probability distribution in direct proportion to the
respective probabilities of those numbers. For example, consider the probability
distribution of the outcome of a throw of two dice. It is known that the probability of
throwing a 2 is 1/36 (as is the probability of throwing a 12), the probability of throwing
a 3 is 2/36, and so on. Therefore, 1/36 of the possible values of a random number should
be associated with throwing a 2, 2/36 of the values with throwing a 3, and so forth.
Thus, if two-digit random numbers are being used, 72 of the 100 values would be
selected for consideration, so that a random number would be rejected if it took on
any one of the other 28 values. Then two of the 72 possible values (say, 00 and 01)
would be assigned an association with throwing a 2, four of them (say, 02, 03, 04,
and 05) would be assigned to throwing a 3, and so on.
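
The allocation for the dice example can be sketched in Python as a table look-up. The names `WAYS`, `LOOKUP`, and `throw_two_dice` are hypothetical, and Python's standard `random` module stands in for the source of two-digit random numbers.

```python
import random

# Number of ways (out of 36) to throw each total with two dice.
WAYS = {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 5, 9: 4, 10: 3, 11: 2, 12: 1}

# Allocate the two-digit values 00, 01, ..., 71 in proportion to the
# probabilities: each of the 36 equally likely outcomes receives two values,
# so LOOKUP has 72 entries (00 and 01 map to 2; 02 through 05 map to 3).
LOOKUP = []
for total, ways in WAYS.items():
    LOOKUP.extend([total] * (2 * ways))


def throw_two_dice(rng=random):
    """Return a random total for two dice using two-digit random numbers."""
    while True:
        two_digit = rng.randrange(100)   # one of 00, 01, ..., 99
        if two_digit < 72:               # the other 28 values are rejected
            return LOOKUP[two_digit]


print([throw_two_dice() for _ in range(10)])
```

Rejecting 28 of the 100 values wastes some random numbers but keeps the allocation exactly proportional to the probabilities.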

For more complicated distributions, whether discrete or continuous, a
generalization of this approach called the inverse transformation method can sometimes be
used to generate random observations. Letting X be the random variable involved,
denote the cumulative distribution function by


\[ F(x) = P\{X \le x\}. \]


Generating each observation then requires the following two steps.


Summary of Inverse Transformation Method


1. Generate a uniform random number r between 0 and 1.
2. Set F(x) = r and solve for x, which then is the desired random observation
from the probability distribution.


This procedure is illustrated in Fig. 23.1 for the case where F(x) is plotted graphically
and the uniform random number r happens to be 0.5269.


Figure 23.1 Illustration of the inverse transformation method for obtaining a random observation from
a given probability distribution. (Plot of F(x), with r = 0.5269 marked on the vertical axis and the
corresponding random observation marked on the x axis.)
Although the graphical procedure illustrated by Fig. 23.1 is convenient if the
simulation is done manually, the computer must revert to some alternative approach.
For discrete distributions, a table look-up approach can be used by constructing a
table that gives the "range" (jump) in the value of F(x) for each value of X = x.
For certain continuous distributions, the F(x) = r equation can be solved analytically
for x, as will now be illustrated.

Consider the exponential distribution (see Sec. 16.4) that has the cumulative
distribution function


\[ F(x) = 1 - e^{-\alpha x}, \quad \text{for } x \ge 0, \]


where $1/\alpha$ is the mean of the distribution. Setting F(x) = r thereby yields

\[ 1 - e^{-\alpha x} = r, \]

so that

\[ e^{-\alpha x} = 1 - r. \]

Therefore, taking the natural logarithm of both sides,

\[ \ln e^{-\alpha x} = \ln (1 - r), \]

so that

\[ -\alpha x = \ln (1 - r), \]

which yields

\[ x = \frac{\ln (1 - r)}{-\alpha} \]

as the desired random observation from the exponential distribution. (It should be
noted that other, more complicated, techniques have also been developed for the
exponential distribution¹ that are faster for a computer than calculating a logarithm.)

Note that (1 — r) is itself a uniform random number. Therefore, to save a 
subtraction, it is common in practice simply to use the original uniform random 
number r directly in place of (1 — r). 
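
A minimal Python sketch of this result follows. The function name `exponential_observation` is hypothetical, and the usage line assumes, for illustration only, an arrival rate of 3 per hour together with the uniform random number 0.096 from Table 23.1.

```python
import math
import random


def exponential_observation(alpha, r=None):
    """Return x = ln(1 - r) / (-alpha), an observation with mean 1/alpha."""
    if r is None:
        r = random.random()   # uniform random number with 0 <= r < 1
    return math.log(1.0 - r) / (-alpha)


# Interarrival time in minutes for a rate of 3 per hour and r = 0.096.
print(60 * exponential_observation(3.0, r=0.096))   # about 2.02
```

This sketch keeps the subtraction: Python's `random.random()` can return exactly 0.0, for which the logarithm of r itself would be undefined, whereas 1 - r is always positive.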

A natural extension of this procedure for the exponential distribution also can
be used to generate a random observation from an Erlang (gamma) distribution (see
Sec. 16.7). The sum of k independent exponential random variables, each with mean
$1/(k\alpha)$, has the Erlang distribution with shape parameter k and mean $1/\alpha$. Therefore,
given a sequence of k random numbers between 0 and 1, say, $r_1, r_2, \ldots, r_k$, the
desired random observation from the Erlang distribution is


\[ x = \sum_{i=1}^{k} \frac{\ln (1 - r_i)}{-k\alpha}, \]


which reduces to

\[ x = -\frac{1}{k\alpha} \ln \left[ \prod_{i=1}^{k} (1 - r_i) \right], \]


where $\prod$ denotes multiplication. Once again, the subtractions may be eliminated
simply by using the $r_i$ directly in place of the $(1 - r_i)$.
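
The product form requires only one logarithm per observation, as the following Python sketch shows. The function name `erlang_observation` and its `rng` parameter are hypothetical.

```python
import math
import random


def erlang_observation(k, alpha, rng=random):
    """Return an Erlang observation with shape parameter k and mean 1/alpha."""
    product = 1.0
    for _ in range(k):
        product *= 1.0 - rng.random()    # each factor lies in (0, 1]
    return -math.log(product) / (k * alpha)


rng = random.Random(0)
sample = [erlang_observation(4, 2.0, rng) for _ in range(20000)]
print(sum(sample) / len(sample))         # close to the mean 1/alpha = 0.5
```

For very large k the running product can underflow to zero in floating-point arithmetic, in which case summing the individual logarithms is the safer form.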

1 For example, see Ahrens, J. H., and U. Dieter: "Efficient Table-Free Sampling Methods for Exponential,
Cauchy, and Normal Distributions," Communications of the ACM, 31:1330-1337, 1988.


A particularly simple technique for generating a random observation from a
normal distribution is obtained by applying the central limit theorem. Because a
uniform random number has a uniform distribution from 0 to 1, it has mean 1/2 and
standard deviation $1/\sqrt{12}$. Therefore, this theorem implies that the sum of n uniform
random numbers has approximately a normal distribution with mean n/2 and standard
deviation $\sqrt{n/12}$. Thus, if $r_1, r_2, \ldots, r_n$ are a sample of uniform random numbers,
then


\[ x = \frac{\sigma}{\sqrt{n/12}} \sum_{i=1}^{n} r_i + \mu - \frac{n}{2} \, \frac{\sigma}{\sqrt{n/12}} \]


is a random observation from an approximately normal distribution with mean $\mu$ and
standard deviation $\sigma$. This approximation is an excellent one (except in the tails of
the distribution), even with small values of n. Thus values of n from 5 to 10 may be
adequate; n = 12 also is a convenient value, because it eliminates the square root
terms from the preceding expression.
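
The central limit theorem approximation translates into a few lines of Python. This is a sketch only; the function name `approx_normal_observation` and its parameters are hypothetical, and n = 12 is used as the default so that the scale factor reduces to sigma.

```python
import math
import random


def approx_normal_observation(mu, sigma, n=12, rng=random):
    """Approximate normal observation built from the sum of n uniform numbers."""
    scale = sigma / math.sqrt(n / 12.0)          # equals sigma when n = 12
    total = sum(rng.random() for _ in range(n))  # mean n/2, std. dev. sqrt(n/12)
    return scale * total + mu - (n / 2.0) * scale


rng = random.Random(0)
sample = [approx_normal_observation(10.0, 2.0, rng=rng) for _ in range(20000)]
mean = sum(sample) / len(sample)
std = math.sqrt(sum((x - mean) ** 2 for x in sample) / len(sample))
print(mean, std)    # close to 10 and 2
```

With n = 12 no observation can fall more than six standard deviations from the mean, which is one concrete sense in which the approximation is weakest in the tails.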

Various exact techniques for generating random observations from a normal 
distribution have also been developed.¹ These exact techniques are sufficiently fast
that, in practice, they generally are used instead of the approximate method described 
above. 

A simple method for handling the chi-square distribution is to use the fact that
it is obtained by summing squares of standardized normal random variables. Thus, if
$y_1, y_2, \ldots, y_n$ are a sample of random observations from a normal distribution with
mean 0 and standard deviation 1, such as could be obtained (approximately) by the
technique just described, then


\[ x = \sum_{i=1}^{n} y_i^2 \]


is a random observation from a chi-square distribution with n degrees of freedom.
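
In Python the sum of squares can be sketched as follows. Here the standard library function `random.gauss` supplies the standardized normal observations in place of the approximate technique described in the text; that substitution, and the function name `chi_square_observation`, are assumptions of this sketch.

```python
import random


def chi_square_observation(n, rng=random):
    """Return the sum of squares of n standard normal observations."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))


rng = random.Random(0)
sample = [chi_square_observation(5, rng) for _ in range(20000)]
print(sum(sample) / len(sample))   # close to 5, the number of degrees of freedom
```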

For many continuous distributions, including the normal and chi-square
distributions, it is not feasible to apply the inverse transformation method because
$x = F^{-1}(r)$ cannot be computed (or at least computed efficiently). Therefore, several other
types of methods have been developed to generate random observations from such 
distributions. Frequently, these methods are considerably faster than the inverse trans- 
formation method even when the latter method can be used. To provide some notion 
of the approach for these alternative methods, we now shall illustrate one called the 
acceptance-rejection method on a simple example. 

Consider the triangular distribution having the probability density function 


\[
f(x) =
\begin{cases}
x, & \text{if } 0 \le x \le 1 \\
1 - (x - 1), & \text{if } 1 \le x \le 2 \\
0, & \text{otherwise.}
\end{cases}
\]


The acceptance-rejection method uses the following two steps (perhaps repeatedly) to
generate a random observation.


1. Generate a uniform random number $r_1$ between 0 and 1, and set $x = 2r_1$.
2. Accept x with

\[
\text{Probability} =
\begin{cases}
x, & \text{if } 0 \le x \le 1 \\
1 - (x - 1), & \text{if } 1 \le x \le 2
\end{cases}
\]

to be the desired random observation. Otherwise, reject x and repeat the two
steps.


1 Ibid.
To randomly generate the event of accepting (or rejecting) x according to this
probability, the method implements step 2 as follows:


2. Generate a uniform random number $r_2$ between 0 and 1.
Accept x if $r_2 \le f(x)$.
Reject x if $r_2 > f(x)$.


If x is rejected, repeat the two steps.


Because $x = 2r_1$ is being accepted with a probability = f(x), the probability
distribution of accepted values has f(x) as its density function, so accepted values are valid
random observations from f(x).

We were fortunate in this example that the largest value of f(x) for any x was
exactly one. If this largest value were $L \ne 1$ instead, then $r_2$ would be multiplied by
L in step 2. With this adjustment, the method is easily extended to other probability
density functions over a finite interval, and similar concepts can be used over an
infinite interval as well.
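
The two steps for the triangular distribution can be sketched in Python as follows. The function names `triangular_density` and `triangular_observation` are hypothetical, and the sketch covers only this example, where the largest value of f(x) is 1.

```python
import random


def triangular_density(x):
    """Density f(x) of the triangular distribution on the interval [0, 2]."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 1.0 - (x - 1.0)
    return 0.0


def triangular_observation(rng=random):
    """Repeat the two steps until a candidate x is accepted."""
    while True:
        x = 2.0 * rng.random()             # step 1: x = 2 * r_1
        r2 = rng.random()                  # step 2: second uniform random number
        if r2 <= triangular_density(x):    # accept with probability f(x)
            return x


rng = random.Random(0)
sample = [triangular_observation(rng) for _ in range(20000)]
print(sum(sample) / len(sample))           # close to 1, the mean of the distribution
```

On average half of the candidates are rejected here, because the area under f(x) is 1 while the enclosing rectangle has area 2.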


Preparing a Simulation Program 


A number of detailed decisions confront the person who must write the computer 
program for executing a simulation. Although an extensive discussion of these issues 
is beyond the scope of this book, we shall mention several major considerations. 

The basic purpose of most simulation studies is to compare alternatives. There- 
fore, the simulation program must be flexible enough to accommodate readily the 
alternatives that will be considered. Because it often is impossible to predict exactly 
what interesting alternatives will be uncovered during the course of the study, it is 
essential that flexibility and provision for rapid, simple modifications be built into the 
program. 

Most of the instructions in a simulation program are logical operations, whereas 
the relatively little actual arithmetic work required is usually of a very simple type. 
This consideration should be reflected in the choice of computer equipment and pro- 
gramming language to be used. 

The considerations just mentioned actually provided part of the motivation for 
the development of general simulation programming languages. For example, GPSS 
and SIMSCRIPT are two such languages that are widely used. These languages are 
designed especially to expedite the type of programming (and reprogramming) unique 
to simulation. Their specific purposes include the following. One objective is to pro- 
vide a convenient means of describing the elements that commonly appear in simu- 
lation models. A second is to expedite changing the design and operating policies of 
the system being simulated, so that a large number of configurations (including some 
suggested during the course of the study) can be considered easily. Another service 
provided by the simulation languages is some type of internal timing and control 
mechanism, with related commands, to assist in the kind of bookkeeping that is 
required when executing a simulation run. They also are designed to obtain data and
statistics conveniently on the aggregate behavior of the system being simulated. Finally,
these languages provide simple operational procedures, such as introducing
changes into the simulation model, initializing the state of the model, altering the kind
of output data to be generated, and stacking a series of simulation runs.

For all these reasons a simulation program often should be written in one of
these simulation languages rather than in a general programming language. The tre- 
mendous savings in programming time ordinarily provided by the simulation languages 
usually compensate for any slight loss in computer running time. 

Finally, it should be emphasized that the strategy of the simulation study should 
be planned carefully before finishing the simulation program. Merely letting the com- 
puter compile masses of data in a blind search for attractive alternatives is far from 
adequate. Simulation basically is a means for conducting an experimental investiga- 
tion. Therefore, just as with a physical experiment, careful attention should be given 
to the construction of a theory of formal hypotheses to be tested and to the skillful 
design of a statistical experiment that will yield valid conclusions. This subject is 
discussed in Secs. 23.3 and 23.4. 


Validating the Model 


The typical simulation model consists of a high number of elements, rules, and logical 
linkages. Therefore, even when the individual components have been carefully tested, 
numerous small approximations can still cumulate into gross distortions in the output 
of the overall model. Consequently, after writing and debugging the computer pro- 
gram, it is important to test the validity of the model for reasonably predicting the 
aggregate behavior of the system being simulated. 

When some form of the real system has already been in operation, its per- 
formance data should be compared with the corresponding output data from the model. 
Standard statistical tests can sometimes be used to determine whether the differences 
in the means, variances, and probability distributions generating the two sets of data 
are statistically significant. The time-dependent behavior of the data might also be
compared statistically. If the data are not amenable to statistical analysis, another
approach is to ask personnel familiar with the behavior of the real system if they can
discriminate between the two sets of data.

If the model is intended to simulate alternative design configurations or operating 
policies for a proposed system for which no actual data are available, it may be 
worthwhile to conduct a field test to collect some real data to compare with the output 
of the model. Conducting such a test might involve constructing a small prototype of 
some version of the proposed system and placing it into operation. Another possibility 
might be to temporarily alter an existing system to correspond to one of the proposals. 

However, field tests frequently are too expensive and time consuming to be 
used. Without any real data as a standard of comparison, the only way to validate the 
overall model is to have knowledgeable people carefully check the credibility of output 
data for a variety of situations. Even when no basis exists for checking the reason- 
ableness of the data for a single situation, some conclusions usually can be drawn 
about how the relative performance of the system should change as various parameters 
are changed. It is especially important to convince the decision makers of the credi- 
bility of the model, so they will be willing to use it at least to aid their decisions. If 
the model may be used again in the future, careful records of its predictions and of 
actual results should be kept to continue the validation process.


Selecting a Statistical Procedure¹


The underlying statistical theory applicable to simulated experimentation is essentially 
indistinguishable from that for physical experimentation. Thus the design of a simu- 
lated experiment should be based upon the large body of knowledge comprising the 
science of statistics. 

There are, however, differences between physical and simulated experimentation 
regarding the emphasis placed on using the various types of statistical procedures. 
Physical experiments frequently involve testing hypotheses about the value of a pop- 
ulation parameter or about the equality of several population means. Simulated ex- 
periments typically place more emphasis on optimization. It probably is taken for 
granted that alternative design configurations have different population means for the 
measure of performance of the system. Instead, the objective of the simulation study 
often is to find the alternative yielding the largest mean for the measure of performance.²
Hence multiple decision tests and complete or partial ordering procedures
frequently are appropriate for simulated experiments. Furthermore, sequential pro- 
cedures tend to be useful, both because the evolution of the experiments may be 
difficult to predict and because a simulated experiment often can be resumed relatively 
easily. 

Another difference between these two types of experiments is the degree to
which the experimental conditions can be held constant when comparing alternatives.
Only simulated experiments can control the variability in the behavior of the elements
of the system during the course of the experiment. By reproducing the same sequence
of random numbers for each alternative simulated, it often is possible to reproduce
an identical sequence of events. This reproduction sharpens the contrast between
alternatives by reducing the residual variation in the differences in the aggregate
performance of the system, so that much smaller sample sizes are required to detect
statistically significant differences. Therefore, this approach usually is far superior to
generating new random numbers for each alternative.

1 This subsection assumes some knowledge of statistical procedures.

2 In some cases, however, the objective is just to describe the performance of proposed systems or policies
for management's evaluation and decision making, so point estimates and confidence intervals probably
would be obtained. Simulated experiments also are occasionally conducted to determine which factors
significantly influence the performance of the system (perhaps to guide subsequent experimentation), in
which case analysis of variance probably would be used.

The fact that reproducing the same random numbers does not yield statistically
independent results should not be of great concern. The correct procedure for comparing
only two alternatives is to pair the results regarding the aggregate performance
of the system that were produced by the same events. Because these pairs of results
are obtained under the same experimental conditions, the differences between them
become the relevant sample observations. This sample would be used to test the
hypothesis that the mean of these differences is zero and to obtain a confidence interval
for this mean. This result would thereby indicate whether there is a statistically significant
difference between the means of the measure of performance for the two
alternatives. If more than two alternatives need to be compared, the Bonferroni
inequality can be used to construct simultaneous confidence intervals on the means of
the differences for the various pairs of alternatives.¹
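
The pairing of results produced by the same random numbers can be sketched in Python as follows. Everything specific in this sketch is an illustrative assumption rather than part of the text: the single-server queue with exponential times, the two mean service times (0.8 and 0.64), the seeds, the run length, and the t critical value 2.262 (95 percent interval with 9 degrees of freedom).

```python
import math
import random

def mean_wait(service_mean, seed, customers=500):
    """Average waiting time in a hypothetical single-server queue."""
    rng = random.Random(seed)            # same seed -> same uniform random numbers
    wait, total = 0.0, 0.0
    for _ in range(customers):
        u_arrival, u_service = rng.random(), rng.random()
        interarrival = -math.log(1.0 - u_arrival)              # mean 1
        service = -service_mean * math.log(1.0 - u_service)
        total += wait
        # Waiting time of the next customer
        wait = max(0.0, wait + service - interarrival)
    return total / customers

seeds = range(10)                        # one pair of runs per seed
# Each difference compares the two alternatives under identical random numbers.
diffs = [mean_wait(0.8, s) - mean_wait(0.64, s) for s in seeds]

n = len(diffs)
d_bar = sum(diffs) / n
s_d = math.sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))
t_crit = 2.262                           # assumed: t value for 9 degrees of freedom
half_width = t_crit * s_d / math.sqrt(n)
confidence_interval = (d_bar - half_width, d_bar + half_width)
```

If the interval excludes zero, the difference between the two alternatives would be judged statistically significant at the chosen level.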

Often it is possible to express the alternatives in terms of the values of one or
more continuous design variables. In these cases, there actually is an infinite number 
of alternatives (although the differences among some of them are minute). Because it 
would be impossible to simulate all of them, it is necessary to take a selective sample 
of these alternatives and then estimate the value of the design variables that will 
maximize some expected measure of performance for the system. There exists con- 
siderable literature that gives efficient procedures for experimentally determining the 
maximum of a mathematical function to within a specified accuracy.²


Variance-Reducing Techniques 


Because considerable computer time usually is required for simulation runs, it is 
important to obtain as much and as precise information as possible from the amount 
of simulation that can be done. Unfortunately, there has been a tendency in practice 
to apply simulation uncritically without giving adequate thought to the efficiency of 
the experimental design. This tendency has occurred despite the fact that considerable 
progress has been made in developing special techniques for increasing the precision 
(i.e., decreasing the variance) of sample estimators. 

These variance-reducing techniques often are called Monte Carlo techniques 
(a term sometimes applied to simulation in general). Because they tend to be rather 
sophisticated, it is not possible to explore them deeply here. However, we shall attempt 
to impart the flavor of these techniques and the great increase in precision they some- 
times provide by presenting two of them in the following example. 

Consider the exponential distribution whose parameter has a value of 1. Thus
its probability density function is $f(x) = e^{-x}$, as shown in Fig. 23.2, and its cumulative
distribution function is $F(x) = 1 - e^{-x}$. It is known that the mean of this
distribution is 1. However, suppose that this mean were not known and that we want 
to estimate this mean by using simulation. 

To provide a standard of comparison for the two variance-reducing techniques, 
we consider first the straightforward simulation approach, sometimes called the crude 
Monte Carlo technique. This approach involves generating some random observations 
from the exponential distribution under consideration and then using the average of 
these observations to estimate the mean. As described in Sec. 23.2, these random 
observations would be 


$$x_i = -\ln(1 - r_i), \qquad \text{for } i = 1, 2, \ldots, n,$$


where $r_1, r_2, \ldots, r_n$ are uniform random numbers between 0 and 1. If we use a
portion of Table 23.4 to obtain 10 such uniform random numbers, the resulting random 
observations are shown in Table 23.6. (These same random numbers also are used to 
illustrate the variance-reducing techniques to sharpen the comparison.) 

1 See Bowker, Albert H., and Gerald J. Lieberman: Engineering Statistics, 2d ed., pp. 304-308,
Prentice-Hall, Englewood Cliffs, N.J., 1972.

2 A survey of the procedures available for this problem is given by Wilde, Douglass J.: Optimum Seeking
Methods, Prentice-Hall, Englewood Cliffs, N.J., 1964.

Figure 23.2 Probability density function for example.

Notice that the sample average in Table 23.6 is 0.779, as opposed to the true
mean of 1.000. However, because the standard deviation of the sample average happens
to be $1/\sqrt{n}$, or $1/\sqrt{10}$ in this case (as could be estimated from the sample), an
error of this amount or larger would occur approximately one-half of the time. Furthermore,
because the standard deviation of a sample average is always inversely
proportional to $\sqrt{n}$, this sample size would need to be quadrupled to reduce this
standard deviation by one-half. These somewhat disheartening facts suggest the need
for other techniques that would obtain such estimates more precisely and more
efficiently.
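
The crude Monte Carlo calculation can be sketched in Python as follows, using the ten random numbers listed in Table 23.6 with 0.0005 added to each, as that table's footnote describes. The variable names are illustrative, and the standard error shown is the usual sample-based estimate rather than the theoretical value $1/\sqrt{n}$.

```python
import math

# Three-digit random numbers r_i as listed in Table 23.6.
R = [0.495, 0.335, 0.791, 0.469, 0.279, 0.698, 0.013, 0.761, 0.290, 0.693]
u = [r + 0.0005 for r in R]              # adjustment described in the table footnote

# Inverse-transform observations from the exponential distribution with mean 1.
observations = [-math.log(1.0 - ui) for ui in u]
n = len(observations)
estimate = sum(observations) / n         # crude Monte Carlo estimate of the mean

# Sample standard deviation and the estimated standard deviation of the average.
s = math.sqrt(sum((x - estimate) ** 2 for x in observations) / (n - 1))
std_error = s / math.sqrt(n)

print(round(estimate, 3))                # 0.779, the value reported in Table 23.6
```

Individual observations may differ from the printed table in the third decimal place, since the table entries appear to come from less precise logarithms.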

A relatively simple Monte Carlo technique for obtaining better estimates is
stratified sampling. There are two shortcomings of the crude Monte Carlo approach
that are rectified by stratified sampling. First, by the very nature of randomness, a
random sample may not provide a particularly uniform cross section of the distribution.
For example, the random sample given in Table 23.6 has no observations between
0.014 and 0.328, even though the probability that a random observation will fall
inside this interval is greater than 1/4. Second, certain portions of a distribution may be
more critical than others for obtaining a precise estimate, but random sampling gives


Table 23.6 Example for Crude Monte Carlo Technique

        Random     Random
        Number*    Observation
  i     r_i        x_i = -ln(1 - r_i)
  1     0.495      0.684
  2     0.335      0.408
  3     0.791      1.568
  4     0.469      0.633
  5     0.279      0.328
  6     0.698      1.199
  7     0.013      0.014
  8     0.761      1.433
  9     0.290      0.343
  10    0.693      1.183
                   Total = 7.793

  Estimate of mean = 0.779

* Actually, 0.0005 was added to the indicated value for each of the r_i, so that
the range of their possible values would be from 0.0005 to 0.9995 rather than
from 0.000 to 0.999.


no special priority to obtaining observations from these portions. For example, the 
tail of an exponential distribution is especially critical in determining its mean. How- 
ever, the random sample in Table 23.6 includes no observations larger than 1.568, 
even though there is at least a small probability of much larger values. This explanation 
is the basic one for this particular sample average being far below the true mean. 
Stratified sampling circumvents these difficulties by dividing the distribution into por- 
tions called strata, where each stratum would be sampled individually with dispro- 
portionately heavy sampling of the more critical strata. 

To illustrate, suppose that the distribution is divided into three strata in the 
manner shown in Table 23.7. These strata were chosen to correspond to observations 
approximately from 0 to 1, from 1 to 3, and from 3 to infinity, respectively. To ensure 
that the random observations generated for each stratum actually lie in that portion of 
the distribution, the random decimal numbers must be converted into the indicated 
range for F(x), as shown in the third column of Table 23.7. The number of observations
to be generated from each stratum is given in the fourth column.¹ The last
column then shows the resulting sampling weight for each stratum, i.e., the ratio of 
the sampling proportion (the fraction of the total sample to be drawn from the stratum) 
to the distribution proportion (the probability of a random observation falling inside 
the stratum). These sampling weights roughly reflect the relative importance of the 
respective strata in determining the mean. 

Given the formulation of the stratified sampling approach shown in Table 23.7, 
the same random numbers used in Table 23.6 yield the observations given in the fifth 
column in Table 23.8. However, it would not be correct to use the unweighted average 
of these observations to estimate the mean, because certain portions of the distribution 
have been sampled more than others. Therefore, before taking the average, divide the 
observations from each stratum by the sampling weight for that stratum to give pro- 
portionate weightings to the different portions of the distribution, as shown in the last 
column of Table 23.8. The resulting weighted average of 0.948 provides the desired 
estimate of the mean. 
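
The stratified sampling calculation can be sketched in Python as follows. The strata, sample sizes, and random numbers are those of Tables 23.6 and 23.7 (with 0.0005 added to each random number, as the footnote to Table 23.6 describes); the variable names and data layout are illustrative assumptions.

```python
import math

# Random numbers r_i from Table 23.6, with the footnoted 0.0005 adjustment.
R = [0.495, 0.335, 0.791, 0.469, 0.279, 0.698, 0.013, 0.761, 0.290, 0.693]
u = [r + 0.0005 for r in R]

# Each stratum: (lower end of its F(x) range, width of the range, sample size).
strata = [(0.00, 0.64, 4), (0.64, 0.32, 4), (0.96, 0.04, 2)]
n = sum(size for _, _, size in strata)

weighted = []
k = 0
for low, width, size in strata:
    w = (size / n) / width               # sampling weight: 5/8, 5/4, and 5
    for _ in range(size):
        r_prime = low + width * u[k]     # stratum random number
        x = -math.log(1.0 - r_prime)     # observation drawn from this stratum
        weighted.append(x / w)           # divide by the sampling weight
        k += 1

estimate = sum(weighted) / n             # weighted average
print(round(estimate, 3))                # 0.948, the value reported in the text
```

Dividing by the sampling weight undoes the deliberate oversampling of the tail, so the average remains an estimate of the overall mean.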

The second variance-reducing technique we shall mention is the method of
complementary random numbers.² The motivation for this method is that the "luck
of the draw" on the uniform random numbers generated may cause the average of


Table 23.7 Formulation of Stratified Sampling Example

             Portion of            Stratum                   Sample    Sampling
  Stratum    Distribution          Random No.                Size      Weight
  1          0 ≤ F(x) ≤ 0.64       r'_i = 0 + 0.64 r_i       4         w_i = (4/10)/0.64 = 5/8
  2          0.64 ≤ F(x) ≤ 0.96    r'_i = 0.64 + 0.32 r_i    4         w_i = (4/10)/0.32 = 5/4
  3          0.96 ≤ F(x) ≤ 1       r'_i = 0.96 + 0.04 r_i    2         w_i = (2/10)/0.04 = 5
1 These sample sizes are roughly based on a recommended guideline that they be proportional to the product
of the probability of a random observation falling inside the corresponding stratum times the standard
deviation within this stratum.


2 This method is a special case of the method of antithetic variates, which attempts to generate pairs of
random observations having a high negative correlation, so that the combined average will tend to be closer
to the mean.

Table 23.8 Example for Stratified Sampling

                   Random    Stratum       Random                   Sampling
                   Number    Random No.    Observation              Weight
  i     Stratum    r_i       r'_i          x_i = -ln(1 - r'_i)      w_i         x_i/w_i
  1     1          0.495     ...           ...                      5/8         0.610
  2     1          0.335     ...           ...                      5/8         0.387
  3     1          0.791     ...           ...                      5/8         1.131
  4     1          0.469     ...           ...                      5/8         0.571
  5     2          0.279     ...           ...                      5/4         1.045
  6     2          0.698     ...           ...                      5/4         1.596
  7     2          0.013     ...           ...                      5/4         0.826
  8     2          0.761     ...           ...                      5/4         1.723
  9     3          0.290     ...           ...                      5           0.712
  10    3          0.693     ...           ...                      5           0.880
                                                                    Total = 9.481
                                                         Estimate of mean = 0.948


the resulting random observations to be substantially on one side of the true mean, 
whereas the complements of those uniform random numbers (which are themselves 
uniform random numbers) would have tended to yield a nearly opposite result. (For 
example, the uniform random numbers in Table 23.6 average less than 0.5, and none 
are as large as 0.8, which led to an estimate substantially below the true mean.) 
Therefore, using both the original uniform random numbers and their complements 
to generate random observations and then calculating the combined sample average 
should provide a more precise estimator of the mean. This approach is illustrated in 
Table 23.9,¹ where the first three columns come from Table 23.6 and the last two
columns use the complementary uniform random numbers, which results in a com- 
bined sample average of 0.920. 
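
The method of complementary random numbers can be sketched in Python as follows, again starting from the ten random numbers of Table 23.6 (with the footnoted 0.0005 added to each). The variable names are illustrative assumptions.

```python
import math

# Random numbers r_i from Table 23.6, with the footnoted 0.0005 adjustment.
R = [0.495, 0.335, 0.791, 0.469, 0.279, 0.698, 0.013, 0.761, 0.290, 0.693]
u = [r + 0.0005 for r in R]
u_comp = [1.0 - ui for ui in u]          # complementary random numbers

x = [-math.log(1.0 - ui) for ui in u]            # original observations
x_comp = [-math.log(1.0 - uc) for uc in u_comp]  # complementary observations

mean_original = sum(x) / len(x)                  # about 0.779
mean_complement = sum(x_comp) / len(x_comp)      # about 1.060
estimate = 0.5 * (mean_original + mean_complement)
print(round(estimate, 3))                        # 0.92, reported as 0.920 in the text
```

Because a small random number gives a small observation while its complement gives a large one, the two averages tend to err in opposite directions and partly cancel.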

This example has suggested that the variance-reducing techniques provide a 
much more precise estimator of the mean than does straightforward simulation. These 
results definitely were not a coincidence, as a derivation of the variance of the esti- 
mators would show. In comparison with straightforward simulation, these techniques 
(including several more complicated ones not presented here) do indeed provide a 
much more precise estimator with the same amount of computer time, or they provide
as precise an estimator with much less computer time. Despite the fact that additional 
analysis may be required to incorporate one or more of these techniques into the 
simulation study, the rewards should not be forgone readily. 

Table 23.9 Example for Method of Complementary Random Numbers

        Random    Random                 Complementary      Random
        Number    Observation            Random Number      Observation
  i     r_i       x_i = -ln(1 - r_i)     r'_i = 1 - r_i     x'_i = -ln(1 - r'_i)
  1     0.495     0.684                  0.505              0.702
  2     0.335     0.408                  0.665              1.092
  3     0.791     1.568                  0.209              0.234
  4     0.469     0.633                  0.531              0.756
  5     0.279     0.328                  0.721              1.275
  6     0.698     1.199                  0.302              0.359
  7     0.013     0.014                  0.987              4.305
  8     0.761     1.433                  0.239              0.272
  9     0.290     0.343                  0.710              1.236
  10    0.693     1.183                  0.307              0.366
        Total:    7.793                                     10.597

  Estimate of mean = ½(0.779 + 1.060) = 0.920

1 It should be noted that 20 calculations of a logarithm were required in this case, in contrast to the 10 that
were required by each of the preceding techniques.

Although this example was a particularly simple one, it is often possible, though
more difficult, to apply these techniques to much more complex problems. For example,
suppose that the objective of the simulation study is to estimate the mean
waiting time of customers in a queueing system (such as those described in Sec. 17.1).
Because both the probability distribution of interarrival times and the probability
distribution of service times are involved, and because consecutive waiting times are
not statistically independent, this problem may appear to be beyond the capabilities
of the variance-reducing techniques. However, as has been described in detail elsewhere,¹
these techniques and others can indeed be applied to this type of problem
very advantageously. For example, the method of complementary random numbers
can be applied simply by repeating the original simulation run, substituting the
complements of the original uniform random numbers to generate the corresponding
random observations.


Tactical Problems 


There are several special tactical issues that arise in connection with gathering the 
data from simulated experiments. We shall briefly describe these here and then sub- 
sequently elaborate on certain ways of dealing with them. 

Many simulation studies are concerned with investigating systems that operate 
continually in a steady-state condition. Unfortunately, a simulation model cannot be 
operated this way; it must be started and stopped. Because of the artificiality introduced 
by the abrupt beginning of operation, the performance of the simulated system does 
not become representative of the corresponding real-world system until it too has 
essentially reached a steady-state condition (i.e., until the probability distribution of 
the state of the simulated system has essentially reached a limiting equilibrium dis- 
tribution). Thus one tactical problem is how to obtain data that are relevant for pre- 
dicting the steady-state behavior of the real system. 

The traditional way of dealing with this problem is to run the simulation model 
for some time without collecting data until it is believed that the simulated system 
has essentially reached a steady-state condition. Unfortunately, it is difficult to esti- 
mate just how long this stabilization period needs to be. Furthermore, available ana- 
lytical results suggest that a surprisingly long period is required, so that a great deal 
of unproductive computer time must be expended. Section 23.4 presents a statistical 
approach that eliminates these difficulties. 

A related tactical issue is the selection of the starting conditions for the simulated
system. The traditional recommendation is that the simulated system should be started
in a state as representative of steady-state conditions as possible to minimize the
required length of the stabilization period. However, the underlying objective of the
simulated experiment is to estimate these conditions, so little advance information
may be available to guide the selection in this way. The procedure in Sec. 23.4 also
eliminates this difficulty.

1 Ehrenfeld, S., and S. Ben-Tuvia: "The Efficiency of Statistical Simulation Procedures," Technometrics,
4(2):257-275, 1962. Also see Selected References 7, 8, and 14 for a general discussion of the theory and
application of various variance-reducing techniques.

Most statistical sampling procedures assume that the experimental output data 
are in the form of a collection of distinct and statistically independent random obser- 
vations from some underlying probability distribution. By contrast, because of the 
nature of the problems for which simulation is used, the observations from a simulated 
experiment are likely to be highly correlated. For example, there is a high correlation 
between the waiting times of consecutive customers in a queueing system. Further- 
more, many measures of performance are such that the simulated experiment yields 
this measure continuously as a function of time rather than as a sequence of separate 
observations. Thus another tactical problem is how to collect the data so as to cir- 
cumvent these difficulties. 

One traditional method is to execute a series of completely separate and inde- 
pendent simulation runs of equal length and to use the average measure of performance 
for each run (excluding the initial stabilization period) as an individual observation. 
The main disadvantage is that each run requires an initial stabilization period for 
approaching a steady-state condition, so that much of the simulation time is unpro- 
ductive. The second traditional method eliminates this disadvantage by making the 
runs consecutively, using the ending condition of one run as the steady-state starting 
condition for the next run. In other words, one continuous overall simulation run 
(except for the one initial stabilization period) is divided for bookkeeping purposes 
into a series of equal portions (runs). The average measure of performance for each 
portion is then treated as an individual observation. The disadvantage of this method 
is that it does not eliminate the correlation between observations entirely, even though 
it may reduce it considerably by making the portions sufficiently long. 
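
The second traditional method (one long run divided into equal portions after a single stabilization period) can be sketched in Python as follows. The queue used to produce the waiting times, the seed, the length of the stabilization period, and the number of portions are all illustrative assumptions.

```python
import math
import random

def waiting_times(num_customers, seed):
    """Hypothetical data source: waiting times in a single-server queue."""
    rng = random.Random(seed)
    waits, wait = [], 0.0
    for _ in range(num_customers):
        waits.append(wait)
        service = -0.8 * math.log(1.0 - rng.random())     # mean 0.8
        interarrival = -math.log(1.0 - rng.random())      # mean 1
        wait = max(0.0, wait + service - interarrival)
    return waits

def batch_means(observations, warmup, num_batches):
    """Drop the stabilization period, then average equal consecutive portions."""
    data = observations[warmup:]
    size = len(data) // num_batches
    return [sum(data[b * size:(b + 1) * size]) / size for b in range(num_batches)]

waits = waiting_times(11000, seed=1)
means = batch_means(waits, warmup=1000, num_batches=10)
grand_mean = sum(means) / len(means)     # average of the portion averages
```

Each entry of `means` is then treated as one observation, although consecutive portions remain somewhat correlated, as the text notes.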

Once again, these difficulties are eliminated by the statistical approach described 
in the next section. 


23.4 The Regenerative Method of Statistical Analysis 


We have just described several difficult tactical problems in gathering data from 
simulated experiments and the shortcomings of traditional statistical procedures in 
dealing with these problems. We now present an innovative statistical approach that 
is especially designed to eliminate these problems. 

The basic concept underlying this approach is that for many systems a simulation 
run can be divided into a series of cycles such that the evolution of the system in a 
cycle is a probabilistic replica of the evolution in any other cycle. Thus, if we calculate 
an appropriate measure of the length of the cycle along with some statistic to sum- 
marize the behavior of interest within each cycle, these statistics for the respective 
cycles constitute a series of independent and identically distributed observations that 
can be analyzed by standard statistical procedures. Because the system keeps going 
through these independent and identically distributed cycles whether or not it is in a 
steady-state condition, these observations are directly applicable from the outset for 
estimating the steady-state behavior of the system. 


For cycles to possess these properties, they must each begin at the same regen- 
eration point, i.e., at the point where the system probabilistically restarts, and can 
proceed without any knowledge of its past history. The system can be viewed as 
regenerating itself at this point in the sense that the probabilistic structure of the future 
behavior of the system depends upon being at this point and not on anything that 
happened previously. (This property is the Markovian property described in Sec. 15.3 
for Markov chains.) A cycle ends when the system again reaches the regeneration 
point (when the next cycle begins). Thus the length of a cycle is just the elapsed time 
between consecutive occurrences of the regeneration point, which is a random variable 
that depends upon the evolution of the system. 

When next-event incrementing is used, a typical regeneration point is a point at 
which an event has just occurred but no future events have yet been scheduled. Thus 
nothing needs to be known about the history of previous schedulings, and the simu- 
lation can start from scratch in scheduling future events. When fixed-time incrementing 
is used, a regeneration point is a point at which the probabilities of possible events 
occurring during the next unit of time do not depend upon when any past events 
occurred, but only on the current state of the system. 

Not every system possesses regeneration points, so this regenerative method 
of collecting data cannot always be used. Furthermore, even when there are regen- 
eration points, the one chosen to define the beginning and ending points of the cycles 
must recur frequently enough so that a substantial number of cycles will be obtained 
with a reasonable amount of computer time.¹ Thus some care must be taken to choose
a suitable regeneration point. 

Perhaps the most important application of the regenerative method to date has 
been to the simulation of queueing systems, including queueing networks (see Sec. 
16.9) such as the ones that arise in computer modeling.²


Example 


Suppose that information needs to be obtained about the steady-state behavior of a 
system that can be formulated as a single-server queueing system (see Sec. 16.2). 
However, both the interarrival and service times have a discrete uniform distribution
with a probability of 1/10 for each of the values of 6, 8, . . . , 24 and the values of 1, 3, . . . ,
19, respectively. Because analytical results are not available, simulation with
next-event incrementing is to be used to obtain the desired results.

Except for the distributions involved, the general approach is the same as de- 
scribed in Sec. 23.1 for Example 2. In particular, the building blocks of the simulation 
model are the same as specified there, including defining the state of the system as 
the number of customers in the system. Suppose that one-digit random numbers are 
used to generate the random observations from the distributions, as shown in Table 
23.10. Beginning the simulation run with zero customers in the system then yields
the results summarized in Table 23.11 and Fig. 23.3, where the random numbers are
obtained sequentially as needed from the tenth row of Table 23.4.¹

Table 23.10 Correspondence between Random Numbers and Random Observations for
Queueing System Example

  Random    Interarrival    Service
  Number    Time            Time
  0         6               1
  1         8               3
  2         10              5
  3         12              7
  4         14              9
  5         16              11
  6         18              13
  7         20              15
  8         22              17
  9         24              19

1 The theoretical requirements for the method are that the expected cycle length be finite and that the
number of cycles would go to infinity if the system continued operating indefinitely.

2 See, for example, Iglehart, Donald L., and Gerald S. Shedler: Regenerative Simulation of Passage Times
in Networks of Queues, Lecture Notes in Control and Information Sciences, Vol. 4, Springer-Verlag, New
York, 1980. For another exposition that emphasizes applications to computer system modeling, see Shedler,
G. S.: Regeneration and Networks of Queues, Springer-Verlag, New York, 1987.

For this system, one regeneration point is where an arrival occurs with no 
previous customers left. At this point, the process probabilistically restarts, so the 
probabilistic structure of when future arrivals and service completions will occur is 
completely independent of any previous history. The only relevant information is that 
the system has just entered the special state of having had zero customers and having 
the time until the next arrival reach zero. The simulation run would not previously 
have scheduled any future events but would now generate both the next interarrival 
time and the service time for the customer that just arrived. 

The only other regeneration points for this system are where an arrival and a 
service completion occur simultaneously, with a prespecified number of customers in 
the system. However, the regeneration point described in the preceding paragraph 
occurs much more frequently and thus is a better choice for defining a cycle. With 
this selection, the first five complete cycles of the simulation run are those shown in 
Fig. 23.3. (In most cases, you should have a considerably larger number of cycles in 
the entire simulation run in order to have sufficient precision in the statistical analysis.) 
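
The simulation run and its regeneration points can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: a one-digit random number d is taken to give an interarrival time of 6 + 2d or a service time of 1 + 2d, the digits are those read from the Random Number column of Table 23.11, and the function and variable names are hypothetical.

```python
def simulate(digits):
    """Next-event simulation of the single-server queue driven by one-digit
    random numbers (an interarrival time is drawn before a service time)."""
    stream = iter(digits)
    t, n = 0, 0                          # simulation clock, number of customers
    next_arrival = 6 + 2 * next(stream)
    next_completion = None               # None means no service in progress
    rows = [(t, n, next_arrival, next_completion)]
    regen_points = []
    try:
        while True:
            arrival = next_completion is None or next_arrival <= next_completion
            completion = next_completion is not None and next_completion <= next_arrival
            t = next_arrival if arrival else next_completion
            if completion:
                n -= 1
                next_completion = None
            if arrival:
                if n == 0:
                    regen_points.append(t)       # arrival finds the system empty
                n += 1
                next_arrival = t + 6 + 2 * next(stream)
            if n > 0 and next_completion is None:
                next_completion = t + 1 + 2 * next(stream)
            rows.append((t, n, next_arrival, next_completion))
    except StopIteration:                # the supplied random numbers are used up
        return rows, regen_points

DIGITS = [9, 2, 6, 4, 6, 4, 1, 1, 1, 3, 9, 1, 4, 1, 1, 5, 6, 2, 1, 5, 6, 9, 3]
rows, regen_points = simulate(DIGITS)
cycle_lengths = [b - a for a, b in zip(regen_points, regen_points[1:])]
print(regen_points)    # [24, 62, 70, 124, 140, 164]
print(cycle_lengths)   # [38, 8, 54, 16, 24]
```

Each tuple in `rows` holds the time, the number of customers, the next arrival time, and the next service completion time, so the five cycle lengths are simply the gaps between consecutive arrivals to an empty system.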

Various types of information about the steady-state behavior of the system can 
be obtained from this simulation run, including point estimates and confidence inter- 
vals for the expected number of customers in the system, the expected waiting time, 
and so on. In each case, it is necessary to use only the corresponding statistics from 
the respective cycles and the lengths of the cycles. We shall first present the general 
statistical expressions for the regenerative method and then apply them to this example. 


Statistical Formulas 


Formally speaking, the statistical problem for the regenerative method is to obtain 
estimates of the expected value of some random variable X of interest. This estimate 
is to be obtained by calculating a statistic Y for each cycle and an appropriate measure 
Z of the size of the cycle such that 


$$E(X) = \frac{E(Y)}{E(Z)}.$$


¹ When both an interarrival time and a service time need to be generated at the same time, the interarrival
time is obtained first.
Table 23.11

        Number of    Random    Next       Next Service
Time    Customers    Number    Arrival    Completion
0       0            9         24         -
24      1            2, 6      34         37
34      2            4         48         37
37      1            6         48         50
48      2            4         62         50
50      1            1         62         53
53      0            -         62         -
62      1            1, 1      70         65
65      0            -         70         -
70      1            3, 9      82         89
82      2            1         90         89
89      1            4         90         98
90      2            1         98         98
98      2            1, 5      106        109
106     3            6         124        109
109     2            2         124        114
114     1            1         124        117
117     0            -         124        -
124     1            5, 6      140        137
137     0            -         140        -
140     1            9, 3      164        147


(The regenerative property ensures that such a ratio formula holds for many steady-state
random variables X.) Thus, if n complete cycles are generated during the simulation
run, the data gathered are $Y_1, Y_2, \ldots, Y_n$ and $Z_1, Z_2, \ldots, Z_n$ for the
respective cycles.

Letting $\bar{Y}$ and $\bar{Z}$, respectively, denote the sample averages for these two sets of
data, the corresponding point estimate of E(X) would be obtained from the formula

$$\text{Est}\{E(X)\} = \frac{\bar{Y}}{\bar{Z}}.$$


Figure 23.3 Outcome of the simulation run for the queueing system example. (Vertical axis: number of customers.)
To obtain a confidence interval for E(X), we must first calculate several quantities
from the data. These quantities include the sample variances

$$s_{11}^2 = \frac{1}{n-1}\sum_{i=1}^{n} (Y_i - \bar{Y})^2, \qquad s_{22}^2 = \frac{1}{n-1}\sum_{i=1}^{n} (Z_i - \bar{Z})^2,$$

and the combined sample covariance

$$s_{12}^2 = \frac{1}{n-1}\sum_{i=1}^{n} (Y_i - \bar{Y})(Z_i - \bar{Z}).$$

Also let

$$s^2 = s_{11}^2 - 2\left(\frac{\bar{Y}}{\bar{Z}}\right)s_{12}^2 + \left(\frac{\bar{Y}}{\bar{Z}}\right)^2 s_{22}^2.$$

Finally, let $\alpha$ be the constant such that $(1 - 2\alpha)$ is the desired confidence coefficient
for the confidence interval, and look up $K_\alpha$ in Table A5.1 (see Appendix 5) for the
normal distribution. If n is not too small, an asymptotic confidence interval for E(X)
is then given by

$$\frac{\bar{Y}}{\bar{Z}} - \frac{K_\alpha s}{\bar{Z}\sqrt{n}} \le E(X) \le \frac{\bar{Y}}{\bar{Z}} + \frac{K_\alpha s}{\bar{Z}\sqrt{n}};$$

that is, the probability is approximately $(1 - 2\alpha)$ that the endpoints of an interval
generated in this way will surround the actual value of E(X).



Application of the Statistical Formulas to the Example 


Consider first how to estimate the expected waiting time for a customer before beginning
service (denoted by $W_q$ in Chap. 16). Thus the random variable X now would
represent a customer's waiting time excluding service, so that


$$W_q = E(X).$$


The corresponding information gathered during the simulation run is the actual waiting 
time (excluding service) incurred by the respective customers. Therefore, for each 
cycle, the summary statistic Y would be the sum of the waiting times, and the size of 
the cycle Z would be the number of customers, so that 


$$W_q = \frac{E(Y)}{E(Z)}.$$

For cycle 1, a total of three customers are processed, so $Z_1 = 3$. The first
customer incurs no waiting before beginning service, the second waits 3 units of time
(from 34 to 37), and the third waits 2 units of time (from 48 to 50), so $Y_1 = 5$. We
proceed similarly for the other cycles. The data for the problem are

$$\begin{aligned}
Y_1 &= 5, & Z_1 &= 3 \\
Y_2 &= 0, & Z_2 &= 1 \\
Y_3 &= 34, & Z_3 &= 5 \\
Y_4 &= 0, & Z_4 &= 1 \\
Y_5 &= 0, & Z_5 &= 1 \\
\bar{Y} &= 7.8, & \bar{Z} &= 2.2.
\end{aligned}$$


Therefore, the point estimate of $W_q$ is

$$\text{Est}\{W_q\} = \frac{7.8}{2.2} \approx 3.55.$$



To obtain a 95 percent confidence interval for $W_q$, the preceding formulas are
first used to calculate

$$s_{11}^2 = 219.20, \quad s_{22}^2 = 3.20, \quad s_{12}^2 = 24.80, \quad s = 9.14.$$

Because $(1 - 2\alpha) = 0.95$, then $\alpha = 0.025$, so that $K_\alpha = 1.96$ from Table A5.1.
The resulting confidence interval is

$$-0.09 \le W_q \le 7.19;$$

that is, $W_q \le 7.19$.


The reason that this confidence interval is so wide (even including impossible 
negative values) is that the number of sample observations (cycles), n = 5, is so 
small. Note in the general formula that the width of the confidence interval is inversely 
proportional to the square root of n, so that, e.g., quadrupling n reduces the width 
by half (assuming no change in s or $\bar{Z}$). Given preliminary values of s and $\bar{Z}$ from a
short preliminary simulation run (such as the run in Table 23.11), this relationship 
makes it possible to estimate in advance the width of the confidence interval that 
would result from any given choice of n for the full simulation run. The final choice 
of n can then be made based on the trade-off between computer time and the precision 
of the statistical analysis. 
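
The general formulas for the regenerative method can be sketched in a few lines of Python. This is only a minimal sketch, not the book's tutorial software: the function names `regenerative_ci` and `cycles_for_half_width` are hypothetical, the default `k_alpha = 1.96` assumes a 95 percent confidence coefficient, and the planning function assumes that s and the average cycle size would not change in a longer run.

```python
import math

def regenerative_ci(y, z, k_alpha=1.96):
    """Return (point estimate, s, (lower, upper)) for E(Y)/E(Z).

    y[i] is the summary statistic for cycle i and z[i] is its size.
    """
    n = len(y)
    if n != len(z) or n < 2:
        raise ValueError("need one Y and one Z per cycle, and at least two cycles")
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    ratio = y_bar / z_bar
    # Sample variances of Y and Z and their sample covariance.
    s11 = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    s22 = sum((zi - z_bar) ** 2 for zi in z) / (n - 1)
    s12 = sum((yi - y_bar) * (zi - z_bar) for yi, zi in zip(y, z)) / (n - 1)
    # Guard against a tiny negative value caused by rounding.
    s = math.sqrt(max(0.0, s11 - 2 * ratio * s12 + ratio ** 2 * s22))
    half_width = k_alpha * s / (z_bar * math.sqrt(n))
    return ratio, s, (ratio - half_width, ratio + half_width)

def cycles_for_half_width(s, z_bar, target, k_alpha=1.96):
    """Number of cycles needed for a given half-width, assuming s and z_bar stay fixed."""
    return math.ceil((k_alpha * s / (z_bar * target)) ** 2)

# Waiting-time data from the five cycles of the example.
estimate, s, interval = regenerative_ci([5, 0, 34, 0, 0], [3, 1, 5, 1, 1])
print(round(estimate, 2), round(s, 2), [round(v, 2) for v in interval])
print(cycles_for_half_width(9.14, 2.2, 1.0))
```

With the waiting-time data of the example, the sketch reproduces the point estimate of about 3.55 and s = 9.14, and it gives an interval from roughly -0.1 to 7.19. Under the stated assumptions, shrinking the half-width to one time unit would take on the order of 67 cycles rather than 5.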

Now suppose that this simulation run is to be used to estimate $P_0$, the probability
of having zero customers in the system. [The theoretical value is known to be $P_0 =
1 - \lambda/\mu = 1 - (\tfrac{1}{15})/(\tfrac{1}{10}) = \tfrac{1}{3}$.] The corresponding information obtained during the
simulation run is the fraction of time during which the system is empty. Therefore,
the summary statistic Y for each cycle would be the total time during which no
customers are present, and the size Z would be the length of the cycle, so that

$$P_0 = \frac{E(Y)}{E(Z)}.$$

The length of cycle 1 is 38 (from 24 to 62), so that $Z_1 = 38$. During this time,
the system is empty from 53 to 62, so that $Y_1 = 9$. Proceeding in this manner for the
other cycles, the following data are obtained for the problem:

$$\begin{aligned}
Y_1 &= 9, & Z_1 &= 38 \\
Y_2 &= 5, & Z_2 &= 8 \\
Y_3 &= 7, & Z_3 &= 54 \\
Y_4 &= 3, & Z_4 &= 16 \\
Y_5 &= 17, & Z_5 &= 24 \\
\bar{Y} &= 8.2, & \bar{Z} &= 28.
\end{aligned}$$

Thus the point estimate of $P_0$ is

$$\text{Est}\{P_0\} = \frac{8.2}{28} = 0.293.$$

By calculating

$$s_{11}^2 = 29.20, \quad s_{22}^2 = 334, \quad s_{12}^2 = 17, \quad s = 6.92,$$

a 95 percent confidence interval for $P_0$ is found to be

$$0.076 \le P_0 \le 0.510.$$

(The wide range of this interval indicates that a much longer simulation run would be
needed to obtain a relatively precise estimate of $P_0$.)
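
Extracting the cycle data from a simulation trace is mechanical, as the following Python sketch suggests. It is an illustration under stated assumptions, not the book's procedure: the function name `empty_time_cycles` is hypothetical, the trace is a list of (time, number of customers just after the event) pairs taken from the example run, and the last two entries (147 and 164) are the service completion and arrival already scheduled in the final row of the run, added here so that the fifth cycle is complete.

```python
def empty_time_cycles(events):
    """Split a trace into regenerative cycles and measure the empty time in each.

    events: list of (time, customers in system just after the event), in time order.
    A cycle starts whenever an arrival finds the system empty.
    Returns (ys, zs): empty time per cycle and length of each cycle.
    """
    ys, zs = [], []
    cycle_start = None   # time of the current regeneration point, if any
    empty = 0            # empty time accumulated in the current cycle
    for (t0, n0), (t1, n1) in zip(events, events[1:]):
        if cycle_start is not None and n0 == 0:
            empty += t1 - t0
        if n0 == 0 and n1 > 0:           # arrival to an empty system
            if cycle_start is not None:  # close the cycle that just ended
                ys.append(empty)
                zs.append(t1 - cycle_start)
            cycle_start, empty = t1, 0
    return ys, zs

trace = [(0, 0), (24, 1), (34, 2), (37, 1), (48, 2), (50, 1), (53, 0),
         (62, 1), (65, 0), (70, 1), (82, 2), (89, 1), (90, 2), (98, 2),
         (106, 3), (109, 2), (114, 1), (117, 0), (124, 1), (137, 0),
         (140, 1), (147, 0), (164, 1)]

ys, zs = empty_time_cycles(trace)
print(ys, zs)
print(round(sum(ys) / sum(zs), 3))   # ratio of sample averages
```

Run on this trace, the sketch recovers the five cycles used above, with empty times 9, 5, 7, 3, 17 and lengths 38, 8, 54, 16, 24. Note that the time before the first regeneration point (0 to 24) is ignored, and that the simultaneous arrival and service completion at time 98 is not treated as a regeneration point here.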

If we redefine Y appropriately, the same approach also can be used to estimate 
other probabilities involving the number of customers in the system. However, because 
this number never exceeded 3 during this simulation run, a much longer run will be 
needed if the probability involves larger numbers. 

The other basic queueing theory expected values defined in Sec. 16.2 ($W$, $L_q$,
$L$) can be estimated from the estimate of $W_q$ by using the relationships among these
four expected values given near the end of Sec. 16.2. However, they can also be
estimated directly from the results of the simulation run. For example, because the
expected number of customers waiting to be served is

$$L_q = \sum_{n=2}^{\infty} (n - 1)P_n,$$

it can be estimated by defining

$$Y = \sum_{n=2}^{\infty} (n - 1)T_n,$$

where $T_n$ is the total time that exactly n customers are in the system during the cycle.
(This definition of Y actually is equivalent to the definition used when estimating $W_q$.)
In this case, Z would be defined as it would be when estimating any $P_n$, namely, the
length of the cycle. The resulting point estimate of $L_q$ then turns out to be simply the
point estimate of $W_q$ multiplied by the actual average arrival rate for the complete
cycles observed.
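
As a quick check of this last statement with the five cycles of the example (a worked illustration, using only the cycle data already listed for $W_q$ and $P_0$): the values of Y are again 5, 0, 34, 0, 0, so $\bar{Y} = 7.8$, while the cycle lengths give $\bar{Z} = 28$. The 11 customers arriving over the 140 time units of the five cycles give an observed arrival rate of 11/140 = 2.2/28.

$$\text{Est}\{L_q\} = \frac{7.8}{28} \approx 0.279 = \frac{7.8}{2.2} \times \frac{2.2}{28} = \text{Est}\{W_q\} \times (\text{observed arrival rate}).$$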

It is also possible to estimate higher moments of these probability distributions
by redefining Y accordingly. For example, the second moment about the origin of the
number of customers waiting to be served ($N_q$),

$$E(N_q^2) = \sum_{n=2}^{\infty} (n - 1)^2 P_n,$$

can be estimated by redefining

$$Y = \sum_{n=2}^{\infty} (n - 1)^2 T_n.$$

This point estimate, along with the point estimate of $L_q$ (the first moment of $N_q$) just
described, can then be used to estimate the variance of $N_q$. Specifically, because of
the general relationship between variance and moments, this variance is

$$\text{Var}(N_q) = E(N_q^2) - L_q^2.$$

Therefore, its point estimate is obtained by substituting in the point estimates of the
quantities on the right-hand side of this relationship.

Finally, we should mention that it was unnecessary to generate the first inter- 
arrival time (24) for the simulation run summarized in Table 23.11 and Fig. 23.3 
because this time played no role in the statistical analysis. It is more efficient with 
the regenerative method just to start the run at the regeneration point. 

Selected References 5 and 14 (chap. 6) provide considerably more information 
about the regenerative method, including how it can be applied to more complicated 
kinds of problems than those considered here. 


23.5 Conclusions 


Simulation is a widely used tool for estimating the performance of complex stochastic 
systems if contemplated designs or operating policies were to be used. 

We have focused in this chapter on the use of simulation for predicting the 
steady-state behavior of systems whose states change only at discrete points in time. 
However, by having a series of runs begin with the prescribed starting conditions, 
we can also use simulation to describe the transient behavior of a proposed system. 
Furthermore, if we use differential equations, simulation can be applied to systems 
whose states change continuously with time. 

Simulation is indeed a very versatile tool. However, it is by no means a panacea. 
Simulation is inherently an imprecise technique. It provides only statistical estimates 
rather than exact results, and it compares alternatives rather than generating an optimal 
one. Furthermore, simulation is a slow and costly way to study a problem. It usually 
requires a large amount of time and expense for analysis and programming, in addition 
to considerable computer running time. Simulation models tend to become unwieldy, 
so that the number of cases that can be run and the accuracy of the results obtained 
often turn out to be very inadequate. Finally, simulation yields only numerical data 
about the performance of the system, so that it provides no additional insight into the 
cause-and-effect relationships within the system except for the clues that can be 
gleaned from these numbers (and from the analysis required to construct the simulation 
model). Therefore, it is very expensive to conduct a sensitivity analysis of the param- 
eter values assumed by the model. The only possible way would be to conduct new 
series of simulation runs with different parameter values, which would tend to provide
relatively little information at a relatively high cost. 

Simulation provides a way of experimenting with proposed systems or policies 
without actually implementing them. Sound statistical theory should be used in de- 
signing these experiments. Surprisingly long simulation runs often are needed to obtain 
statistically significant results. However, variance-reducing techniques can be very
helpful in reducing the length of the runs needed. 

Several tactical problems arise when we apply traditional statistical estimation 
procedures to simulated experiments. These problems include prescribing appropriate 
starting conditions, determining when a steady-state condition has essentially been 
reached, and dealing with statistically dependent observations. These problems can 
be eliminated by using the regenerative method of statistical analysis. However, there 
are some restrictions on when this method can be applied. 

Simulation unquestionably has an important place in the theory and practice of 
operations research. It is an invaluable tool for use on those problems where analytical 
techniques are inadequate. 
PROBLEMS


(Random numbers needed to do these problems manually should be obtained from Table 23.4. 
For each part, use the digits consecutively starting from the front of the top row to form three- 
digit random numbers 096, 569, 665, and so on.) 


1.* Use the mixed congruential method to generate the following sequences of random
numbers:

(a) A sequence of 10 one-digit random numbers such that $x_{n+1} = (x_n + 3)$ (modulo 10)
and $x_0 = 2$.

(b) A sequence of eight random numbers between 0 and 7 such that $x_{n+1} = (5x_n + 1)$
(modulo 8) and $x_0 = 1$.

(c) A sequence of five two-digit random numbers such that $x_{n+1} = (61x_n + 27)$ (modulo
100) and $x_0 = 10$.


2. Use the mixed congruential method to generate a sequence of five two-digit random
numbers such that $x_{n+1} = (41x_n + 33)$ (modulo 100) and $x_0 = 48$.


3. Use the mixed congruential method to generate the following sequences of random
numbers:
(a) A sequence of five random numbers between 0 and 31 such that $x_{n+1} = (13x_n +
15)$ (modulo 32) and $x_0 = 14$.
(b) A sequence of three three-digit random numbers such that $x_{n+1} = (201x_n +
503)$ (modulo 1,000) and $x_0 = 485$.


4.* Use the one-digit random numbers—5, 2, 4, 9, 7—to generate random observations 
for each of the following situations: 
(a) Throwing an unbiased coin. 
(b) Throwing a die. 
(c) The color of a traffic light found by a randomly arriving car when it is green 40 
percent of the time, yellow 10 percent of the time, and red 50 percent of the time. 


5. Generate five random observations from a uniform distribution between -10 and
+40.


6.* Suppose that random observations are needed from the triangular distribution whose
probability density function is

$$f(x) = \begin{cases} 2x, & \text{if } 0 \le x \le 1 \\ 0, & \text{otherwise.} \end{cases}$$


(a) Derive an expression for each random observation as a function of the uniform 
random number r. 
(b) Generate five random observations. 


7. Generate three random observations from each of the following probability distri- 
butions: 

(a) The uniform distribution from 25 to 75. 

(b) The distribution whose probability density function is

$$f(x) = \begin{cases} \tfrac{1}{4}(x + 1)^3, & \text{if } -1 \le x \le 1 \\ 0, & \text{otherwise.} \end{cases}$$

(c) The distribution whose probability density function is

$$f(x) = \begin{cases} \tfrac{1}{200}(x - 40), & \text{if } 40 \le x \le 60 \\ 0, & \text{otherwise.} \end{cases}$$
8. Generate three random observations from each of the following probability
distributions:
(a) The random variable X has P{X = 0} = $. Given X # 0, it has a uniform distribution 
between —5 and 15. 
(b) The distribution whose probability density function is

$$f(x) = \begin{cases} x - 1, & \text{if } 1 \le x \le 2 \\ 3 - x, & \text{if } 2 \le x \le 3. \end{cases}$$


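(c) The geometric distribution with parameter $p = \tfrac{1}{3}$, so that

$$P\{X = k\} = \begin{cases} \tfrac{1}{3}\left(\tfrac{2}{3}\right)^{k-1}, & \text{if } k = 1, 2, \ldots \\ 0, & \text{otherwise.} \end{cases}$$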


9. Generate three random observations from a normal distribution with mean = 10 and
standard deviation = 5. (Use n = 3 for each observation.) 


10. Generate four random observations from a normal distribution with mean = 0 and 
standard deviation = 1. (Use n = 3 for each observation.) Then use these four observations 
to generate two random observations from a chi-square distribution with 2 degrees of freedom. 


11.* Generate two random observations from each of the following probability distri- 
butions: 
(a) The exponential distribution with mean = 4. 
(b) The Erlang distribution with mean = 4 and shape parameter k = 2 (that is, standard
deviation = $2\sqrt{2}$).
(c) The normal distribution with mean = 4 and standard deviation = $2\sqrt{2}$. (Use
n = 6 for each observation.)


12. Generate four random observations from an exponential distribution with mean = 
1. Then use these four observations to generate one random observation from an Erlang dis- 
tribution with mean = 4 and shape parameter k = 4. 


13. Use the acceptance-rejection method to generate three random observations from 
the triangular distribution used to illustrate this method in Sec. 23.2. 


14. Use the acceptance-rejection method to generate three random observations from
the probability density function

$$f(x) = \begin{cases} \tfrac{1}{50}(x - 10), & \text{if } 10 \le x \le 20 \\ 0, & \text{otherwise.} \end{cases}$$


15. The weather can be considered a stochastic system, because it evolves in a proba- 
bilistic manner from one day to the next. Suppose for a certain location that this probabilistic 
evolution satisfies the following description: 

The probability of rain tomorrow is 0.6 if it is raining today. 

The probability of being clear (no rain) tomorrow is 0.8 if it is clear today. 

Simulate the evolution of the weather for 10 days, beginning the day after a clear day. 


16. The game of craps requires the player to throw two dice one or more times until a 
decision has been reached as to whether he wins or loses. He wins if the first throw results in 
a sum of 7 or 11, or, alternatively, if the first sum is 4, 5, 6, 8, 9, or 10 and the same sum 
reappears before a sum of 7 has appeared. Conversely, he loses if the first throw results in a 
sum of 2, 3, or 12, or, alternatively, if the first sum is 4, 5, 6, 8, 9, or 10 and a sum of 7 
appears before the first sum reappears. 

(a) Simulate five plays of this game to start the process of estimating the probability of 

winning. 

(b) For a large number of plays of the game, the proportion of wins has approximately
a normal distribution with mean = 0.493 and standard deviation = $0.5/\sqrt{n}$. Use
this information to calculate the number of simulated plays that would be required
to have a probability of at least 0.95 that the proportion of wins will be less than
0.5.


17. Consider the M/M/1 queueing theory model that was discussed in Sec. 16.6 and 
Example 2, Sec. 23.1. Suppose that the mean arrival rate is 5 per hour, the mean service rate 
is 10 per hour, and you are required to estimate the expected waiting time before service begins 
by using simulation. 

(a) Starting with the system empty, use next-event incrementing to perform the simu- 

lation until two service completions have occurred. 

(b) Starting with the system empty, use fixed-time incrementing (with 2 minutes as the 
time unit) to perform the simulation until two service completions have occurred. 

(c) Write a computer simulation program with next-event incrementing for this problem. 
Use the regenerative method with 100 cycles to obtain a point estimate and 95 
percent confidence interval for the steady-state expected waiting time before service 
begins. Compare these results with the theoretical value. 

(d) Write a computer simulation program with fixed-time incrementing and 0.1 minute 
as the time unit. Use the regenerative method with 100 cycles to obtain a point 
estimate and 95 percent confidence interval for the steady-state expected waiting 
time before service begins. Compare these results with the theoretical value. 


18.* Consider the probability distribution whose probability density function is 


L flsxs œo 
f(x) = 
0, otherwise. 


The problem is to perform a simulated experiment, with the help of variance-reducing tech- 
niques, for estimating the mean of this distribution. To provide a standard of comparison, also 
derive the mean analytically. 

For each of the following cases, generate 10 observations and calculate the resulting 
estimate of the mean: 

(a) Use the crude Monte Carlo method. 

(b) Use stratified sampling with three strata: $0 \le F(x) \le 0.6$, $0.6 < F(x) \le 0.9$,
$0.9 < F(x) \le 1$, with 3, 3, and 4 observations, respectively.

(c) Use the method of complementary random numbers.


19. One product produced by a certain company requires that bushings be drilled into 
a metal block and that cylindrical shafts be inserted into the bushings. The shafts are required 
to have a radius of at least 1.0000 inch, but the radius should be as little larger than this as 
possible. In actuality, the probability distribution of what the radius of a shaft will be (in inches) 
has the probability density function 


Fo PA 120000) if x = 1.0000 
Ss X TA . 
> otherwise. 


Similarly, the probability distribution of what the radius of a bushing will be (in inches) has 
the probability density function 


flo) Pak if 1.0000 = x = 1.0100 

X = 

a 0, otherwise. 

The clearance between a bushing and a shaft is the difference in their radii. Because they are 
selected at random, there occasionally is interference (i.e., negative clearance) between a bush- 
ing and a shaft that were to be mated. The objective is to determine how frequently this 
interference will happen under the current probability distributions. 
Perform a simulated experiment for estimating the probability of interference. Notice
that almost all cases of interference will occur when the radius of the bushing is much closer 
to 1.0000 inch than to 1.0100 inches. Therefore, it appears that an efficient experiment would 
generate most of the simulated bushings from this critical portion of the distribution. Take this 
observation into account in part (b). For each of the following cases, generate 10 observations 
and calculate the resulting estimate of the probability of interference: 

(a) Use the crude Monte Carlo method. 

(b) Develop and apply a stratified sampling approach to this problem. 

(c) Use the method of complementary random numbers. 


20. Simulation is being used to study a system whose measure of performance X will
be partially determined by the outcome of a certain external factor. This factor has three possible
outcomes (unfavorable, neutral, and favorable) that will occur with equal probability ($\tfrac{1}{3}$).
Because the favorable outcome would greatly increase the spread of possible values of X, this
outcome is more critical than the others for estimating the mean and variance of X. Therefore,
a stratified sampling approach has been adopted, with six random observations of the value of
X generated under the favorable outcome, three generated under the neutral outcome, and one
generated under the unfavorable outcome, as follows:


Outcome of           Simulated
External Factor      Values of X
Favorable            8, 5, 1, 6, 3, 7
Neutral              3, 5, 2
Unfavorable          2


(a) Develop the resulting estimate of $E(X)$.
(b) Develop the resulting estimate of $E(X^2)$.


21.* A certain single-server system has been simulated, with the following sequence of 
waiting times before service for the respective customers. Use the regenerative method to obtain 
a point estimate and 90 percent confidence interval for the steady-state expected waiting time 
before service. 

(a) 0, 5, 4, 0, 2, 0, 3, 1, 6, 

(b) 0, 3, 2, 0, 3, 1, 5, 0, 0, 


22. Consider the queueing system example presented in Sec. 23.4 for the regenerative 
method. Explain why the point where a service completion occurs with no other customers left 
is not a regeneration point. 


23. A company has been having a maintenance problem with a certain complex piece 
of equipment. This equipment contains four identical vacuum tubes that have been the cause 
of the trouble. The problem is that the tubes fail fairly frequently, thereby forcing the equipment 
to be shut down while a replacement is made. The current practice is to replace tubes only 
when they fail. However, a proposal has been made to replace all four tubes whenever any one 
of them fails to reduce the frequency with which the equipment must be shut down. The 
objective is to compare these two alternatives on a cost basis. 

The pertinent data are the following. For each tube, the operating time until failure has 
approximately a uniform distribution from 1,000 to 2,000 hours. The equipment must be shut 
down for 1 hour to replace one tube or for 2 hours to replace all four tubes. The total cost 
associated with shutting down the equipment and replacing tubes is $100/hour plus $20 for 
each new tube. 

(a) Starting with four new tubes, simulate the operation of the two alternative policies 

for 5,000 hours of simulated time. 

(b) Use the data from part (a) to make a preliminary comparison of the two alternatives 

on a cost basis. 


(c) For the proposed policy, describe an appropriate regeneration point for defining 
cycles that will permit applying the regenerative method of statistical analysis. Ex- 
plain why the regenerative method cannot be applied to the current policy. 

(d) For the proposed policy, use the regenerative method to obtain a point estimate and 
95 percent confidence interval for the steady-state expected cost per hour from the 
data obtained in part (a). 

(e) Write a computer simulation program for the two alternative policies. Then repeat 
parts (a), (b), and (d) on the computer, with 100 cycles for the proposed policy and 
55,000 hours of simulated time (including a stabilization period of 5,000 hours) for 
the current policy. 


24. A manufacturing company has two planers for cutting flat surfaces in large work 
pieces of two different types. The time required to perform each job varies somewhat, depending 
largely upon the number of passes that must be made. In particular, for both types of work 
pieces, the time required by a planer has approximately the following probability distribution: 





Time (min) Probability 





10 0.30 
20 0.25 
30 0.18 
40 0.12 
50 0.08 
60 0.045 
70 0.015 
80 0.007 
90 0.003 


Every half hour one work piece of both types is brought to the planer department. 

Unfortunately, the planer department has had a difficult time keeping up with its work- 
load. Frequently there are a number of work pieces waiting for a free planer. This waiting has 
seriously disrupted the production schedule for the subsequent operations, thereby greatly in- 
creasing the cost of in-process inventory as well as the cost of idle equipment and resulting 
lost production. Therefore, a proposal has been made to obtain one additional planer to relieve 
this bottleneck. 

It is estimated that the total incremental cost (including capital recovery cost) associated 
with obtaining and operating another planer would be $30/hour. (This estimate takes into 
account the fact that, even with an additional planer, the total running time for all the planers 
will remain the same.) It is also estimated that the total cost associated with work pieces having 
to wait to be processed is $200 per work piece per hour and $100 per work piece per hour for 
work pieces of the first and second types, respectively. Because of this difference in costs, 
work pieces of the first type always are given priority over those of the second type. In other 
words, if a planer becomes free when work pieces of both types are waiting, a work piece of 
the first type always is chosen to be processed next. 

(a) Starting with all planers idle waiting for work pieces to arrive momentarily, use 
next-event incrementing to simulate the operation of the two alternative policies (the 
status quo or obtaining one additional planer) for 3 hours of simulated time. 

(b) Describe an appropriate regeneration point for defining cycles that will permit ap- 
plying the regenerative method of statistical analysis to this problem. 

(c) Write a computer simulation program for the two alternative policies. Use the re- 
generative method with 100 cycles each to compare the two alternatives on a cost 
basis. 


25. Select any of the typical applications of simulation listed at the end of Sec. 23.2 
and develop a simulation model for this type of problem. 
Appendix


Convexity 


The concept of convexity is frequently used in operations research work. Therefore, we introduce 
the properties of convex (or concave) functions and convex sets. 


Definition: A function of a single variable, f(x), is a convex function if, for each pair
of values of x, say, $x'$ and $x''$,

$$f[\lambda x'' + (1 - \lambda)x'] \le \lambda f(x'') + (1 - \lambda) f(x')$$

for all values of $\lambda$ such that $0 \le \lambda \le 1$. It is a strictly convex function if $\le$ can be
replaced by $<$. It is a concave function (or a strictly concave function) if this statement
holds when $\le$ is replaced by $\ge$ (or by $>$).


This definition has an enlightening geometric interpretation. Consider the graph of the
function f(x) drawn as a function of x. Then $[x', f(x')]$ and $[x'', f(x'')]$ are two points on the
graph of f(x), and $[\lambda x'' + (1 - \lambda)x', \lambda f(x'') + (1 - \lambda)f(x')]$ represents the various points
on the line segment between these two points when $0 \le \lambda \le 1$. Thus the original inequality
in the definition indicates that this line segment lies entirely above or on the graph of the
function. Therefore, f(x) is convex if, for each pair of points on the graph of f(x), the line
segment joining these two points lies entirely above or on the graph of f(x). In other words,
f(x) is convex if it is "always bending upward." (This condition is sometimes referred to as
"concave upward," as opposed to "concave downward" for a concave function.) To be more
precise, if f(x) possesses a second derivative everywhere, then f(x) is convex if and only if
$d^2f(x)/dx^2 \ge 0$ for all values of x [for which f(x) is defined]. Similarly, f(x) is strictly convex
when $d^2f(x)/dx^2 > 0$, concave when $d^2f(x)/dx^2 \le 0$, and strictly concave when $d^2f(x)/dx^2
< 0$. Some examples are given in Figs. A1.1 to A1.4.
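
The defining inequality can also be probed numerically. The Python sketch below is an illustration only: the function name `find_convexity_violation`, the sample grid, and the tolerance are all assumptions made here. A check on finitely many points can expose a violation of convexity, but finding none does not prove that a function is convex.

```python
def find_convexity_violation(f, xs, lambdas, tol=1e-9):
    """Search sample points for a violation of
    f[lam*x2 + (1 - lam)*x1] <= lam*f(x2) + (1 - lam)*f(x1).

    Returns the first (x1, x2, lam) that violates the inequality, or None.
    """
    for x1 in xs:
        for x2 in xs:
            for lam in lambdas:
                point = lam * x2 + (1 - lam) * x1
                chord = lam * f(x2) + (1 - lam) * f(x1)
                if f(point) > chord + tol:
                    return (x1, x2, lam)
    return None

xs = [i / 4 for i in range(-8, 9)]        # sample points from -2 to 2
lambdas = [j / 10 for j in range(11)]     # lambda = 0, 0.1, ..., 1

print(find_convexity_violation(lambda x: x ** 2, xs, lambdas))   # no violation found
print(find_convexity_violation(lambda x: x ** 3, xs, lambdas))   # some violating triple
```

Here $x^2$ passes on every sampled pair, in line with its second derivative being positive everywhere, whereas $x^3$ fails for pairs of negative values, where its second derivative is negative.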

The concept of a convex function also generalizes to functions of more than one variable.
Thus if f(x) is replaced by $f(x_1, x_2, \ldots, x_n)$, the definition just given still applies if x is
replaced everywhere by $(x_1, x_2, \ldots, x_n)$. Similarly, the corresponding geometric interpretation
is still valid after generalizing the concepts of points and line segments. Thus, just as a
particular value of (x, y) is interpreted as a point in two-dimensional space, each possible value
of $(x_1, x_2, \ldots, x_m)$ may be thought of as a point in m-dimensional (Euclidean) space. By
letting m = n + 1, the points on the graph of $f(x_1, x_2, \ldots, x_n)$ become the possible values
of $[x_1, x_2, \ldots, x_n, f(x_1, x_2, \ldots, x_n)]$. Another point, $(x_1, x_2, \ldots, x_n, x_{n+1})$, is said to lie
above, on, or below the graph of $f(x_1, x_2, \ldots, x_n)$, according to whether $x_{n+1}$ is larger, equal
to, or smaller than $f(x_1, x_2, \ldots, x_n)$, respectively.


Definition: The line segment joining any two points $(x_1', x_2', \ldots, x_m')$ and
$(x_1'', x_2'', \ldots, x_m'')$ is the collection of points

$$(x_1, x_2, \ldots, x_m) = [\lambda x_1'' + (1 - \lambda)x_1', \; \lambda x_2'' + (1 - \lambda)x_2', \; \ldots, \; \lambda x_m'' + (1 - \lambda)x_m']$$

such that $0 \le \lambda \le 1$.

Thus a line segment in m-dimensional space is a direct generalization of a line segment
in two-dimensional space. For example, if

$$(x_1', x_2') = (2, 6), \qquad (x_1'', x_2'') = (3, 4),$$

then the line segment joining them is the collection of points

$$(x_1, x_2) = [3\lambda + 2(1 - \lambda), \; 4\lambda + 6(1 - \lambda)],$$

where $0 \le \lambda \le 1$.


Definition: $f(x_1, x_2, \ldots, x_n)$ is a convex function if, for each pair of points on the
graph of $f(x_1, x_2, \ldots, x_n)$, the line segment joining these two points lies entirely above
or on the graph of $f(x_1, x_2, \ldots, x_n)$. It is a strictly convex function if this line segment
actually lies entirely above this graph except at the endpoints of the line segment.
Concave functions and strictly concave functions are defined in exactly the same way,
except that above is replaced by below.


Figure A1.1 A convex function.

Figure A1.2 A concave function.

Figure A1.3 A function that is both convex and concave.

Figure A1.4 A function that is neither convex nor concave.


Just as the second derivative can be used (when it exists everywhere) to check whether
a function of a single variable is convex, so second partial derivatives can be used to check
functions of several variables, although in a more complicated way. For example, if there are
two variables, then $f(x_1, x_2)$ is convex if and only if

$$(1) \quad \frac{\partial^2 f(x_1, x_2)}{\partial x_1^2}\,\frac{\partial^2 f(x_1, x_2)}{\partial x_2^2} - \left[\frac{\partial^2 f(x_1, x_2)}{\partial x_1\,\partial x_2}\right]^2 \ge 0,$$

$$(2) \quad \frac{\partial^2 f(x_1, x_2)}{\partial x_1^2} \ge 0,$$

and

$$(3) \quad \frac{\partial^2 f(x_1, x_2)}{\partial x_2^2} \ge 0,$$

for all possible values of $(x_1, x_2)$, assuming that these partial derivatives exist everywhere. It
is strictly convex if $\ge$ can be replaced by $>$ in all three conditions [but now condition (3) is
superfluous and can be omitted because it is implied by the other two conditions], whereas
$f(x_1, x_2)$ is concave if $\ge$ can be replaced by $\le$ in conditions (2) and (3). When there are more
than two variables, the conditions for convexity are a generalization of the ones just shown. In
mathematical terminology, $f(x_1, x_2, \ldots, x_n)$ is convex if and only if its $n \times n$ Hessian matrix
is positive semidefinite for all possible values of $(x_1, x_2, \ldots, x_n)$.

Thus far convexity has been treated as a general property of a function. However, many 
nonconvex functions do satisfy the conditions for convexity over certain intervals for the re- 
spective variables. Therefore, it is meaningful to talk about a function being convex over a 
certain region. For example, a function is said to be convex within a neighborhood of a specified 
point if its second derivative or partial derivatives satisfy the conditions for convexity at that 
point. This concept is useful in Appendix 2. 

Finally, two particularly important properties of convex functions should be mentioned.
First, if $f(x_1, x_2, \ldots, x_n)$ is a convex function, then $g(x_1, x_2, \ldots, x_n) =
-f(x_1, x_2, \ldots, x_n)$ is a concave function, and vice versa. Second, the sum of convex
functions is a convex function. To illustrate,

\[
f_1(x_1) = x_1^4 + 2x_1^2 - 5x_1
\]

and

\[
f_2(x_1, x_2) = x_1^2 + 2x_1x_2 + x_2^2
\]

are both convex functions, as you can verify by calculating their second derivatives. Therefore,
the sum of these functions,

\[
f(x_1, x_2) = x_1^4 + 3x_1^2 - 5x_1 + 2x_1x_2 + x_2^2,
\]

is a convex function, whereas its negative,

\[
g(x_1, x_2) = -x_1^4 - 3x_1^2 + 5x_1 - 2x_1x_2 - x_2^2,
\]

is a concave function.

Figure A1.5 Example of a convex set determined by a convex function.
Figure A1.6 Example of a convex set determined by a concave function.

The concept of a convex function leads quite naturally to the related concept of a convex 
set. Thus, if $f(x_1, x_2, \ldots, x_n)$ is a convex function, then the collection of points that lie
above or on the graph of $f(x_1, x_2, \ldots, x_n)$ forms a convex set. Similarly, the collection of
points that lie below or on the graph of a concave function is a convex set. These cases are
illustrated in Figs. A1.5 and A1.6 for the case of a single independent variable. Furthermore,
convex sets have the important property that, for any given group of convex sets, the collection
of points that lie in all of them (i.e., the intersection of these convex sets) is also a convex set.
Therefore, the collection of points that lie both above or on a convex function and below or
on a concave function is a convex set, as illustrated in Fig. A1.7. Thus convex sets may be
viewed intuitively as a collection of points whose bottom boundary is a convex function and
whose top boundary is a concave function. To be a bit more precise, a convex set may be
defined as follows: 


Definition: A convex set is a collection of points such that, for each pair of points in 
the collection, the entire line segment joining these two points is also in the collection. 


The distinction between nonconvex sets and convex sets is illustrated in Figs. A1.8 and
A1.9. Thus the set of points shown in Fig. A1.8 is not a convex set because there exist many
pairs of these points, for example, (1, 2) and (2, 1), such that the line segment between them
does not lie entirely within the set. This is not the case for the set in Fig. A1.9, which is
convex.

Figure A1.7 Example of a convex set determined by both convex and concave functions.
Figure A1.8 Example of a set that is not convex.
Figure A1.9 Example of a convex set.

In conclusion, the useful concept of an extreme point of a convex set needs to be 
introduced. 


Definition: An extreme point of a convex set is a point in the set that does not lie on 
any line segment that joins two other points in the set. 


Thus the extreme points of the convex set in Fig. A1.9 are (0, 0), (0, 2), (1, 2), (2, 1), 
(1, 0), and all the infinite number of points on the boundary between (2, 1) and (1, 0). If this 
particular boundary were a line segment instead, then the set would have only the five listed 
extreme points.


Appendix 2


Classical Optimization
Methods


This appendix reviews the classical methods of calculus for finding a solution that maximizes 
or minimizes (1) a function of a single variable, (2) a function of several variables, and (3) a 
function of several variables subject to constraints on the values of these variables. It is assumed 
that the functions considered possess continuous first and second derivatives and partial deriv- 
atives everywhere. Some of the concepts discussed next have been introduced briefly in Secs. 
14.2 and 14.3. 

Consider a function of a single variable, such as that shown in Fig. A2.1. A necessary 
condition for a particular solution, x = x*, to be either a minimum or a maximum is that 


\[
\frac{df(x)}{dx} = 0 \quad \text{at } x = x^*.
\]

Thus in Fig. A2.1 there are five solutions satisfying this condition. To obtain more
information about these five so-called critical points, it is necessary to examine the second derivative.
Thus, if

\[
\frac{d^2 f(x)}{dx^2} > 0 \quad \text{at } x = x^*,
\]

then x* must be at least a local minimum [that is, $f(x^*) \le f(x)$ for all x sufficiently close to
x*]. Using the language introduced in Appendix 1, we can say that x* must be a local minimum
if f(x) is strictly convex within a neighborhood of x*. Similarly, a sufficient condition for x*
to be a local maximum (given that it satisfies the necessary condition) is that f(x) is strictly
concave within a neighborhood of x* (that is, the second derivative is negative at x*). If the
second derivative is zero, the issue is not resolved (the point may even be an inflection point),
and it is necessary to examine higher derivatives.

Figure A2.1 A function having several maxima and minima.
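The second-derivative test described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function $f(x) = x^3 - 3x$ and all names below are hypothetical choices, not taken from the text, and the derivatives are written out by hand.

```python
def classify_critical_point(second_derivative_value):
    """Apply the second-derivative test at a point where f'(x*) = 0."""
    if second_derivative_value > 0:
        return "local minimum"
    if second_derivative_value < 0:
        return "local maximum"
    # Zero second derivative: higher derivatives must be examined.
    return "unresolved"


# Hypothetical example: f(x) = x**3 - 3*x
def f_prime(x):
    return 3 * x ** 2 - 3


def f_second(x):
    return 6 * x


critical_points = [-1.0, 1.0]          # roots of f_prime
for x_star in critical_points:
    assert f_prime(x_star) == 0        # the necessary condition holds
    print(x_star, classify_critical_point(f_second(x_star)))
```

The "unresolved" branch corresponds to the case in which the second derivative is zero, where the test by itself gives no conclusion.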

To find a global minimum [i.e., a solution x* such that $f(x^*) \le f(x)$ for all x], it is
necessary to compare the local minima and identify the one that yields the smallest value of
f(x). If this value is less than f(x) as $x \to -\infty$ and as $x \to +\infty$ (or at the endpoints of the
function, if it is defined only over a finite interval), then this point is a global minimum. Such
a point is shown in Fig. A2.1, along with the global maximum, which is identified in an 
analogous way. 

However, if f(x) is known to be either a convex or concave function (see Appendix 1 
for a description of such functions), the analysis becomes much simpler. In particular, if f(x) 
is a convex function, such as the one shown in Fig. A1.1, then any solution x* such that


\[
\frac{df(x)}{dx} = 0 \quad \text{at } x = x^*
\]

is known automatically to be a global minimum. In other words, this condition is not only a 
necessary but a sufficient condition for a global minimum of a convex function. If this function 
actually is strictly convex, then this solution must be the only global minimum. (However, if 
the function is either always decreasing or always increasing, so the derivative is nonzero for 
all values of x, then there will be no global minimum at a finite value of x.) Otherwise, there 
could be a tie for the global minimum over a single interval where the derivative is zero. 
Similarly, if f(x) is a concave function, then having 


\[
\frac{df(x)}{dx} = 0 \quad \text{at } x = x^*
\]


becomes both a necessary and sufficient condition for x* to be a global maximum. 
The analysis for an unconstrained function of several variables f(x), where $\mathbf{x} =
(x_1, x_2, \ldots, x_n)$, is similar. Thus a necessary condition for a solution x = x* to be either a
minimum or a maximum is that

\[
\frac{\partial f(\mathbf{x})}{\partial x_j} = 0 \quad \text{at } \mathbf{x} = \mathbf{x}^*, \quad \text{for } j = 1, 2, \ldots, n.
\]

After identifying the critical points that satisfy this condition, each such point is then classified
as a local minimum or maximum if the function is strictly convex or strictly concave, respectively,
within a neighborhood of the point. (Additional analysis is required if the function is
neither.) The global minimum and maximum would be found by comparing the local minima
and maxima and then checking the value of the function as some of the variables approach $-\infty$
or $+\infty$. However, if the function is known to be convex or concave, then a critical point must
be a global minimum or a global maximum, respectively.

Now consider the problem of finding the minimum or maximum of the function f(x), 
subject to the restriction that x must satisfy all the equations 


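\[
\begin{aligned}
g_1(\mathbf{x}) &= b_1 \\
g_2(\mathbf{x}) &= b_2 \\
&\;\;\vdots \\
g_m(\mathbf{x}) &= b_m,
\end{aligned}
\]

where m < n. For example, if n = 2 and m = 1, the problem might be

\[
\begin{aligned}
\text{Maximize} \quad & f(x_1, x_2) = x_1^2 + 2x_2, \\
\text{subject to} \quad & g(x_1, x_2) = x_1^2 + x_2^2 = 1.
\end{aligned}
\]

In this case $(x_1, x_2)$ is restricted to be on the circle of radius 1, whose center is at the origin,
so that the goal is to find the point on this circle that yields the largest value of $f(x_1, x_2)$. This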
example is soon solved after a general approach to the problem is outlined. 

A classical method of dealing with this problem is the method of Lagrange multipliers. 
This procedure begins by formulating the Lagrangian function, 


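\[
h(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x}) - \sum_{i=1}^{m} \lambda_i \left[ g_i(\mathbf{x}) - b_i \right],
\]

where the new variables $\boldsymbol{\lambda} = (\lambda_1, \lambda_2, \ldots, \lambda_m)$ are called Lagrange multipliers. Notice the
key fact that for the feasible values of x,

\[
g_i(\mathbf{x}) - b_i = 0, \quad \text{for all } i,
\]

so $h(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x})$. Therefore, it can be shown that if $(\mathbf{x}, \boldsymbol{\lambda}) = (\mathbf{x}^*, \boldsymbol{\lambda}^*)$ is a local or global
minimum or maximum for the unconstrained function $h(\mathbf{x}, \boldsymbol{\lambda})$, then x* is a corresponding critical
point for the original problem. As a result, the method now reduces to analyzing $h(\mathbf{x}, \boldsymbol{\lambda})$ by
the procedure just described for unconstrained functions. Thus the (n + m) partial derivatives
would be set equal to zero; that is,

\[
\frac{\partial h}{\partial x_j} = \frac{\partial f}{\partial x_j} - \sum_{i=1}^{m} \lambda_i \frac{\partial g_i}{\partial x_j} = 0, \quad \text{for } j = 1, 2, \ldots, n,
\]

\[
\frac{\partial h}{\partial \lambda_i} = -g_i(\mathbf{x}) + b_i = 0, \quad \text{for } i = 1, 2, \ldots, m,
\]

and then the critical points would be obtained by solving these equations for $(\mathbf{x}, \boldsymbol{\lambda})$. Notice that
the last m equations are equivalent to the constraints in the original problem, so only feasible
solutions are considered. After further analysis to identify the global minimum or maximum of
$h(\cdot)$, the resulting value of x is then the desired solution to the original problem.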

It should be pointed out that from a practical computational viewpoint, the method of 
Lagrange multipliers is not a particularly powerful procedure. It is often essentially impossible 
to solve the equations to obtain the critical points. Furthermore, even when they can be obtained, 
the number of critical points may be so large (often infinite) that it is impractical to attempt to 
identify a global minimum or maximum. However, for certain types of small problems, this 


method can sometimes be used successfully. To illustrate, consider the example introduced
earlier. In this case,

\[
h(x_1, x_2, \lambda) = x_1^2 + 2x_2 - \lambda (x_1^2 + x_2^2 - 1),
\]

so that

\[
\begin{aligned}
\frac{\partial h}{\partial x_1} &= 2x_1 - 2\lambda x_1 = 0, \\
\frac{\partial h}{\partial x_2} &= 2 - 2\lambda x_2 = 0, \\
\frac{\partial h}{\partial \lambda} &= -\left[ x_1^2 + x_2^2 - 1 \right] = 0.
\end{aligned}
\]

The first equation implies that either $\lambda = 1$ or $x_1 = 0$. If $\lambda = 1$, then the other two equations
imply that $x_2 = 1$ and $x_1 = 0$. If $x_1 = 0$, then the third equation implies that $x_2 = \pm 1$.
Therefore, the two critical points for the original problem are $(x_1, x_2) = (0, 1)$ and $(0, -1)$.
Thus it is apparent that these points are the global maximum and minimum, respectively. 
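The conclusion of this example can be checked numerically. The sketch below is an illustration, not part of the original text: it verifies that both critical points make all three partial derivatives of the Lagrangian vanish (with the multiplier values 1 and -1 implied by the second equation), and then scans the unit circle by brute force to confirm the largest and smallest objective values. The grid size and all names are assumptions.

```python
import math


def f(x1, x2):
    """Objective function of the example."""
    return x1 ** 2 + 2 * x2


def lagrangian_gradient(x1, x2, lam):
    """Partial derivatives of h = x1^2 + 2*x2 - lam*(x1^2 + x2^2 - 1)."""
    return (2 * x1 - 2 * lam * x1,        # with respect to x1
            2 - 2 * lam * x2,             # with respect to x2
            -(x1 ** 2 + x2 ** 2 - 1))     # with respect to lam


# Critical points as (x1, x2, lam); lam follows from 2 - 2*lam*x2 = 0.
critical_points = [(0.0, 1.0, 1.0), (0.0, -1.0, -1.0)]
for point in critical_points:
    assert all(abs(g) < 1e-12 for g in lagrangian_gradient(*point))

# Brute-force scan of the constraint x1^2 + x2^2 = 1 on an evenly spaced grid.
n = 3600
values = [f(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
          for k in range(n)]
print(max(values), min(values))   # close to f(0, 1) = 2 and f(0, -1) = -2
```

The scan is a sanity check under a finite grid, not a proof; it simply agrees with the values of f at the two critical points.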

In presenting the classical optimization methods just described, we have assumed that 
you are already familiar with derivatives and how to obtain them. However, there is a special 
case of importance in operations research work that warrants additional explanation, namely, 
the derivative of an integral. In particular, consider how to find the derivative of the function 


\[
F(y) = \int_{g(y)}^{h(y)} f(x, y) \, dx,
\]

where g(y) and h(y) are the limits of integration expressed as functions of y. To begin, suppose
that these limits of integration are constants, so that g(y) = a and h(y) = b, respectively. For
this special case, it can be shown that, given the regularity conditions assumed at the beginning
of this appendix, the derivative is simply

\[
\frac{d}{dy} \int_a^b f(x, y) \, dx = \int_a^b \frac{\partial f(x, y)}{\partial y} \, dx.
\]

For example, if $f(x, y) = e^{-xy}$, a = 0, and $b = \infty$, then

\[
\frac{d}{dy} \int_0^{\infty} e^{-xy} \, dx = \int_0^{\infty} (-x) e^{-xy} \, dx = -\frac{1}{y^2}
\]

at any positive value of y. Thus the intuitive procedure of interchanging the order of
differentiation and integration is valid for this case. However, finding the derivative becomes a little
more complicated than this when the limits of integration are functions. In particular,

\[
\frac{d}{dy} \int_{g(y)}^{h(y)} f(x, y) \, dx = \int_{g(y)}^{h(y)} \frac{\partial f(x, y)}{\partial y} \, dx + f(h(y), y) \frac{dh(y)}{dy} - f(g(y), y) \frac{dg(y)}{dy},
\]

where f(h(y), y) is obtained by writing out f(x, y) and then replacing x by h(y) wherever
it appears, and similarly for f(g(y), y). To illustrate, if $f(x, y) = x^2 y^3$, g(y) = y, and
h(y) = 2y, then

\[
\frac{d}{dy} \int_y^{2y} x^2 y^3 \, dx = \int_y^{2y} 3x^2 y^2 \, dx + (2y)^2 y^3 (2) - y^2 y^3 (1) = 14y^5
\]


at any positive value of y. 
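The last example can be verified numerically. The sketch below is an illustration under stated assumptions: the integrals are approximated by Simpson's rule (which is exact here because the integrands are quadratic in x), the direct derivative of F(y) is approximated by a central difference, and all function names and the step size are hypothetical choices.

```python
def simpson(func, a, b, n=200):
    """Composite Simpson's rule with an even number n of subintervals."""
    width = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * width)
    return total * width / 3


def f(x, y):
    return x ** 2 * y ** 3


def df_dy(x, y):
    return 3 * x ** 2 * y ** 2


def lower(y):            # g(y) = y, so dg/dy = 1
    return y


def upper(y):            # h(y) = 2y, so dh/dy = 2
    return 2 * y


def derivative_by_formula(y):
    """Right-hand side of the formula for variable limits of integration."""
    integral_term = simpson(lambda x: df_dy(x, y), lower(y), upper(y))
    return integral_term + f(upper(y), y) * 2 - f(lower(y), y) * 1


def F(y):
    return simpson(lambda x: f(x, y), lower(y), upper(y))


def derivative_by_difference(y, step=1e-4):
    """Central-difference approximation of dF/dy."""
    return (F(y + step) - F(y - step)) / (2 * step)


y = 1.5
print(derivative_by_formula(y), derivative_by_difference(y), 14 * y ** 5)
```

All three printed numbers should agree to several decimal places, the small discrepancy in the middle one coming from the finite-difference step.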
Appendix 3


Matrices and Matrix
Operations


A matrix is defined to be a rectangular array of numbers. For example, 


\[
\mathbf{A} = \begin{bmatrix} 2 & 5 \\ 3 & 0 \\ 1 & 1 \end{bmatrix}
\]


is a 3 × 2 matrix (where 3 × 2 denotes "3 by 2") because it is a rectangular array of numbers
with three rows and two columns. (Matrices are denoted in this book by boldface capital 
letters.) The numbers in the rectangular array are called the elements of the matrix. For 
example, 


\[
\mathbf{B} = \begin{bmatrix} 1 & 2.4 & 0 & \sqrt{3} \\ -4 & 2 & -1 & 15 \end{bmatrix}
\]

is a 2 × 4 matrix whose elements are 1, 2.4, 0, $\sqrt{3}$, $-4$, 2, $-1$, and 15. Thus, in more
general terms,

\[
\mathbf{A} = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix} = \|a_{ij}\|
\]

is an m × n matrix, where $a_{11}, \ldots, a_{mn}$ represent the numbers that are the elements of this
matrix; $\|a_{ij}\|$ is shorthand notation for identifying the matrix whose element in row i and column
j is $a_{ij}$ for every i = 1, 2, ..., m and j = 1, 2, ..., n.


Because matrices do not possess a numerical value, they cannot be added, multiplied, 
and so on as if they were individual numbers. However, it is sometimes desirable to perform 
certain manipulations on arrays of numbers. Therefore, rules have been developed for perform- 
ing operations on matrices that are analogous to arithmetic operations. To describe these, let 
$\mathbf{A} = \|a_{ij}\|$ and $\mathbf{B} = \|b_{ij}\|$ be two matrices having the same number of rows and the same
number of columns. Then A and B are said to be equal (A = B) if and only if all of the
corresponding elements are equal ($a_{ij} = b_{ij}$ for all i and j). The operation of multiplying a
matrix by a number (denote this number by k) is performed by multiplying each element of the
matrix by k, so that


\[
k\mathbf{A} = \|k a_{ij}\|.
\]


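For example,

\[
3 \begin{bmatrix} 1 & \tfrac{1}{3} & 2 \\ 5 & 0 & -3 \end{bmatrix} = \begin{bmatrix} 3 & 1 & 6 \\ 15 & 0 & -9 \end{bmatrix}.
\]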


To add A and B, simply add the corresponding elements, so that 
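\[
\mathbf{A} + \mathbf{B} = \|a_{ij} + b_{ij}\|.
\]

To illustrate,

\[
\begin{bmatrix} 5 & 3 \\ 1 & 6 \end{bmatrix} + \begin{bmatrix} 2 & 0 \\ 3 & 1 \end{bmatrix} = \begin{bmatrix} 7 & 3 \\ 4 & 7 \end{bmatrix}.
\]

Similarly, subtraction is done as

\[
\mathbf{A} - \mathbf{B} = \mathbf{A} + (-1)\mathbf{B},
\]

so that

\[
\mathbf{A} - \mathbf{B} = \|a_{ij} - b_{ij}\|.
\]

For example,

\[
\begin{bmatrix} 5 & 3 \\ 1 & 6 \end{bmatrix} - \begin{bmatrix} 2 & 0 \\ 3 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 3 \\ -2 & 5 \end{bmatrix}.
\]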


Note that, with the exception of multiplication by a number, all the preceding operations are 
defined only when the two matrices involved are of the same size. However, all of them are 
straightforward because they involve performing only the same comparison or arithmetic op- 
eration on the corresponding elements of the matrices. 

There exists one additional elementary operation that has not been defined, matrix mul- 
tiplication, but it is considerably more complicated. To find the element in row i, column j of 
the matrix resulting from multiplying A times B, it is necessary to multiply each element in 
row i of A by the corresponding element in column j of B and then to add these products. 
Therefore, the matrix multiplication is defined if and only if the number of columns of A equals 
the number of rows of B, because this condition is required if we are to perform the specified 
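element-by-element multiplication. Thus, if A is an m × n matrix and B is an n × r matrix,
then their product is

\[
\mathbf{AB} = \left\| \sum_{k=1}^{n} a_{ik} b_{kj} \right\|.
\]

To illustrate,

\[
\begin{bmatrix} 1 & 2 \\ 4 & 0 \\ 2 & 3 \end{bmatrix}
\begin{bmatrix} 3 & 1 \\ 2 & 5 \end{bmatrix}
= \begin{bmatrix}
1(3) + 2(2) & 1(1) + 2(5) \\
4(3) + 0(2) & 4(1) + 0(5) \\
2(3) + 3(2) & 2(1) + 3(5)
\end{bmatrix}
= \begin{bmatrix} 7 & 11 \\ 12 & 4 \\ 12 & 17 \end{bmatrix}.
\]

On the other hand, if one attempts to multiply these matrices in the reverse order, the resulting
product,

\[
\begin{bmatrix} 3 & 1 \\ 2 & 5 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 4 & 0 \\ 2 & 3 \end{bmatrix},
\]

is not even defined. Even when both AB and BA are defined,

\[
\mathbf{AB} \ne \mathbf{BA}
\]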


in general. Thus matrix multiplication should be viewed as a specially designed operation whose 
properties are quite different from those of arithmetic multiplication. To understand why this 
special definition was adopted, consider the following system of equations: 


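\[
\begin{aligned}
2x_1 - x_2 + 5x_3 + x_4 &= 20 \\
x_1 + 5x_2 + 4x_3 + 5x_4 &= 30 \\
3x_1 + x_2 - 6x_3 + 2x_4 &= 20.
\end{aligned}
\]

Rather than writing these equations out as shown here, they can be written much more concisely
in matrix form as

\[
\mathbf{A}\mathbf{x} = \mathbf{b},
\]

where

\[
\mathbf{A} = \begin{bmatrix} 2 & -1 & 5 & 1 \\ 1 & 5 & 4 & 5 \\ 3 & 1 & -6 & 2 \end{bmatrix}, \quad
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}, \quad
\mathbf{b} = \begin{bmatrix} 20 \\ 30 \\ 20 \end{bmatrix}.
\]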


It is this kind of multiplication for which matrix multiplication is designed. 
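The row-by-column rule for matrix multiplication can be sketched directly. This is a minimal illustration, assuming matrices are stored as lists of rows; the function name `matmul` and the trial vector are hypothetical. It reproduces the 3 × 2 by 2 × 2 product worked out above and then forms the left-hand sides of the system Ax = b for one trial value of x.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[1, 2], [4, 0], [2, 3]]
B = [[3, 1], [2, 5]]
print(matmul(A, B))            # [[7, 11], [12, 4], [12, 17]]

# The reverse order is not defined: B has 2 columns but A has 3 rows.
try:
    matmul(B, A)
except ValueError as error:
    print("BA is not defined:", error)

# Left-hand sides of the three equations for a trial column vector x.
coefficients = [[2, -1, 5, 1], [1, 5, 4, 5], [3, 1, -6, 2]]
trial_x = [[1], [2], [3], [4]]          # hypothetical values of x1, ..., x4
print(matmul(coefficients, trial_x))    # [[19], [43], [-5]]
```

Each entry of the last result is the left-hand side of one equation evaluated at the trial point, which is exactly what the matrix form Ax expresses.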

Carefully note that matrix division is not defined. 

Although the matrix operations described here do not possess certain of the properties 
of arithmetic operations, they do satisfy the following laws: 


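\[
\begin{aligned}
\mathbf{A} + \mathbf{B} &= \mathbf{B} + \mathbf{A}, \\
(\mathbf{A} + \mathbf{B}) + \mathbf{C} &= \mathbf{A} + (\mathbf{B} + \mathbf{C}), \\
\mathbf{A}(\mathbf{B} + \mathbf{C}) &= \mathbf{AB} + \mathbf{AC}, \\
\mathbf{A}(\mathbf{BC}) &= (\mathbf{AB})\mathbf{C},
\end{aligned}
\]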


when the relative sizes of these matrices are such that the indicated operations are defined. 

Another type of matrix operation, which has no arithmetic analog, is the transpose 
operation. This operation involves nothing more than interchanging the rows and columns of 
the matrix, which is frequently useful for performing the multiplication operation in the desired 
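way. Thus, for any matrix $\mathbf{A} = \|a_{ij}\|$, its transpose $\mathbf{A}^T$ is

\[
\mathbf{A}^T = \|a_{ji}\|.
\]

For example, if

\[
\mathbf{A} = \begin{bmatrix} 2 & 5 \\ 1 & 3 \\ 4 & 0 \end{bmatrix},
\]

then

\[
\mathbf{A}^T = \begin{bmatrix} 2 & 1 & 4 \\ 5 & 3 & 0 \end{bmatrix}.
\]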


Zero and 1 are numbers that play a special role in arithmetic. There also exist special
matrices that play a similar role in matrix theory. In particular, the matrix that is analogous to
1 is the identity matrix I, which is a square matrix whose elements are zeroes except for ones
along the main diagonal. Thus

\[
\mathbf{I} = \begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}.
\]

The number of rows or columns of I can be specified as desired. The analogy of I to 1 follows
from the fact that for any matrix A,

\[
\mathbf{IA} = \mathbf{A} = \mathbf{AI},
\]


where I is assigned the appropriate number of rows and columns in each case for the multi- 
plication operation to be defined. Similarly, the matrix that is analogous to zero is the so-called 
null matrix 0, which is a matrix of any size whose elements are all zeroes. Thus 


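\[
\mathbf{0} = \begin{bmatrix}
0 & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0
\end{bmatrix}.
\]

Therefore, for any matrix A,

\[
\mathbf{A} + \mathbf{0} = \mathbf{A}, \quad \mathbf{A} - \mathbf{A} = \mathbf{0}, \quad \text{and} \quad \mathbf{0A} = \mathbf{0} = \mathbf{A0},
\]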


where 0 is the appropriate size in each case for the operations to be defined. 
On certain occasions, it is useful to partition a matrix into several smaller matrices called 
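submatrices. For example, one possible way of partitioning a 3 × 4 matrix would be

\[
\mathbf{A} = \left[ \begin{array}{c|ccc}
a_{11} & a_{12} & a_{13} & a_{14} \\ \hline
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34}
\end{array} \right]
= \begin{bmatrix} a_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{bmatrix},
\]

where

\[
\mathbf{A}_{12} = [a_{12}, a_{13}, a_{14}], \quad
\mathbf{A}_{21} = \begin{bmatrix} a_{21} \\ a_{31} \end{bmatrix}, \quad
\mathbf{A}_{22} = \begin{bmatrix} a_{22} & a_{23} & a_{24} \\ a_{32} & a_{33} & a_{34} \end{bmatrix}
\]

all are submatrices. Rather than perform operations element by element on such partitioned
matrices, we can instead do them in terms of the submatrices, provided the partitionings are
such that the operations are defined. For example, if B is a partitioned 4 × 1 matrix such that

\[
\mathbf{B} = \begin{bmatrix} b_1 \\ \hline b_2 \\ b_3 \\ b_4 \end{bmatrix}
= \begin{bmatrix} b_1 \\ \mathbf{B}_2 \end{bmatrix},
\]

then

\[
\mathbf{AB} = \begin{bmatrix} a_{11} b_1 + \mathbf{A}_{12} \mathbf{B}_2 \\ \mathbf{A}_{21} b_1 + \mathbf{A}_{22} \mathbf{B}_2 \end{bmatrix}.
\]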


A special kind of matrix that plays an important role in matrix theory is the kind that 
has either a single row or a single column. Such matrices are often referred to as vectors. Thus 


\[
\mathbf{x} = [x_1, x_2, \ldots, x_n]
\]

is a row vector, and

\[
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\]

is a column vector. (Vectors are denoted in this book by boldface lowercase letters.) These
vectors also are sometimes called n-vectors to indicate that they have n elements. For example,

\[
\mathbf{x} = [1, 4, -2, 4, 7]
\]

is a 5-vector. A null vector 0 is either a row vector or a column vector whose elements are
all zeroes, i.e.,

\[
\mathbf{0} = [0, 0, \ldots, 0] \quad \text{or} \quad \mathbf{0} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}.
\]
(Although the same symbol 0 is used for either kind of null vector, as well as for a null matrix, 
the context normally will identify which it is.) 
One reason vectors play an important role in matrix theory is that any m × n matrix can
be partitioned into either m row vectors or n column vectors, and important properties of the
matrix can be analyzed in terms of these vectors. To amplify, consider a set of n-vectors,
$\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m$, of the same type (i.e., they are either all row vectors or all column vectors).


Definition: A set of vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m$ is said to be linearly dependent if there
exist m numbers (denoted by $c_1, c_2, \ldots, c_m$), some of which are not zero, such that

\[
c_1 \mathbf{x}_1 + c_2 \mathbf{x}_2 + \cdots + c_m \mathbf{x}_m = \mathbf{0}.
\]


Otherwise, the set is said to be linearly independent. 


To illustrate, if m = 3 and 


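\[
\begin{aligned}
\mathbf{x}_1 &= [1, 1, 1] \\
\mathbf{x}_2 &= [0, 1, 1] \\
\mathbf{x}_3 &= [2, 5, 5],
\end{aligned}
\]

then

\[
2\mathbf{x}_1 + 3\mathbf{x}_2 - \mathbf{x}_3 = \mathbf{0},
\]

so that

\[
\mathbf{x}_3 = 2\mathbf{x}_1 + 3\mathbf{x}_2.
\]

Thus $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3$ would be linearly dependent because one of them is a linear combination of
the others. However, if $\mathbf{x}_3$ were changed to

\[
\mathbf{x}_3 = [2, 5, 6]
\]

instead, then $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3$ would be linearly independent.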


Definition: The rank of a set of vectors is the largest number of linearly independent 
vectors that can be chosen from the set. 


Continuing the preceding example, the rank of the set of vectors $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3$ was 2, but
it became 3 after changing $\mathbf{x}_3$.
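The rank of a small set of vectors can be computed by row reduction. The following is a minimal sketch, not a method given in the text: it assumes the vectors are supplied as lists of numbers of equal length, uses exact fractions to avoid rounding, and the function name `rank` is a hypothetical choice.

```python
from fractions import Fraction


def rank(vectors):
    """Largest number of linearly independent vectors, via row reduction."""
    rows = [[Fraction(value) for value in vector] for vector in vectors]
    n_cols = len(rows[0]) if rows else 0
    found = 0                                   # independent rows found so far
    for col in range(n_cols):
        # Look for a row, not yet used as a pivot, with a nonzero entry here.
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        # Eliminate this column from every row below the pivot row.
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


print(rank([[1, 1, 1], [0, 1, 1], [2, 5, 5]]))   # 2
print(rank([[1, 1, 1], [0, 1, 1], [2, 5, 6]]))   # 3
```

In the first call the third row is reduced to all zeroes, which reflects the linear dependence among the three vectors in the example.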


Definition: A basis for a set of vectors is a collection of linearly independent vectors
taken from the set such that every vector in the set is a linear combination of the vectors
in the collection (i.e., every vector in the set equals the sum of certain multiples of the
vectors in the collection).


To illustrate, $\mathbf{x}_1$ and $\mathbf{x}_2$ constituted a basis for $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3$ in the preceding example before
$\mathbf{x}_3$ was changed.


Theorem A3.1: A collection of r linearly independent vectors chosen from a set of 
vectors is a basis for the set if and only if the set has rank r. 


Given the preceding results regarding vectors, it is now possible to present certain im- 
portant concepts regarding matrices. 


Definition: The row rank of a matrix is the rank of its set of row vectors. The column 
rank of a matrix is the rank of its column vectors. 


For example, if the matrix A is 
\[
\mathbf{A} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 2 & 5 & 5 \end{bmatrix},
\]


then its row rank was shown to be 2. Note that the column rank of A is also 2. This fact is no 
coincidence, as the following general theorem indicates. 


Theorem A3.2: The row rank and column rank of a matrix are equal. 


Thus it is only necessary to speak of the rank of a matrix. 
The final concept to be discussed is that of the inverse of a matrix. For any nonzero 
number k, there exists a reciprocal or inverse, $k^{-1} = 1/k$, such that

\[
k k^{-1} = k^{-1} k = 1.
\]

Is there an analogous concept that is valid in matrix theory? In other words, for a given matrix
A other than the null matrix, does there exist a matrix $\mathbf{A}^{-1}$ such that

\[
\mathbf{A}\mathbf{A}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}?
\]


If A is not a square matrix (i.e., if the number of rows and columns of A differ), the answer 
is never, because these matrix products would necessarily have a different number of rows for 
the multiplication to be defined (so that the equality operation would not be defined). However, 
if A is square, then the answer is under certain circumstances, as indicated in Theorem A3.3. 


Definition: A matrix is called nonsingular if its rank equals both the number of rows 
and the number of columns. Otherwise, it is called singular. 


Thus only square matrices can be nonsingular. A useful way of testing for nonsingularity 
is provided by the fact that a square matrix is nonsingular if and only if its determinant is 
nonzero. 


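Theorem A3.3: (a) If A is nonsingular, there is a unique nonsingular matrix $\mathbf{A}^{-1}$,
called the inverse of A, such that $\mathbf{A}\mathbf{A}^{-1} = \mathbf{I} = \mathbf{A}^{-1}\mathbf{A}$.
(b) If A is nonsingular and B is a matrix for which either AB = I or
BA = I, then $\mathbf{B} = \mathbf{A}^{-1}$.
(c) Only nonsingular matrices have inverses.

To illustrate, consider the matrix

\[
\mathbf{A} = \begin{bmatrix} 5 & -4 \\ 1 & -1 \end{bmatrix}.
\]

Notice that the rank of A is 2, so it is nonsingular. Therefore, A must have an inverse, which
happens to be

\[
\mathbf{A}^{-1} = \begin{bmatrix} 1 & -4 \\ 1 & -5 \end{bmatrix}.
\]

Hence,

\[
\mathbf{A}\mathbf{A}^{-1} = \begin{bmatrix} 5 & -4 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 1 & -4 \\ 1 & -5 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

and

\[
\mathbf{A}^{-1}\mathbf{A} = \begin{bmatrix} 1 & -4 \\ 1 & -5 \end{bmatrix} \begin{bmatrix} 5 & -4 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]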

Appendix 4


Simultaneous Linear 
Equations 


Consider the system of simultaneous linear equations 


\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1, \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2, \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m.
\end{aligned}
\]

It is commonly assumed that this system has a solution, and a unique solution, if and only if
m = n. However, this assumption is an oversimplification. It raises the questions: Under what
conditions will these equations have a simultaneous solution? Given that they do, when will
there be only one such solution? If there is a unique solution, how can it be identified in a
systematic way? These questions are the ones we explore in this appendix. The discussion of
the first two questions assumes that you are familiar with the basic information about matrices
in Appendix 3.

The preceding system of equations can also be written in matrix form as

\[
\mathbf{A}\mathbf{x} = \mathbf{b},
\]

where

\[
\mathbf{A} = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}, \quad
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad
\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.
\]


The first two questions can be answered immediately in terms of the properties of these matrices. 
First, the system of equations possesses at least one solution if and only if the rank of A equals 
the rank of [A, b]. (Notice that equality is guaranteed if the rank of A equals m.) This result 
follows immediately from the definitions of rank and linear independence given in Appendix 
3, because if the rank of [A, b] exceeds the rank of A by 1 (the only other possibility), then
b is linearly independent of the column vectors of A (that is, b cannot equal any linear
combination Ax of these vectors).

Second, given that these ranks are equal, there are then two possibilities. If the rank of
A is n (its maximum possible value), then the system of equations will possess exactly one
solution. [This result follows from Theorem A3.1, the definition of a basis, and part (b) of
Theorem A3.3.] If the rank of A is less than n, then there will exist an infinite number of
solutions. (This result follows from the fact that for any basis of the column vectors of A, the
$x_j$ corresponding to column vectors not in this basis can be assigned any value, and there will
still exist a solution for the other variables as before.) 

Finally, it should be noted that if A and [A, b] have a common rank r such that r < m, 
then (m — r) of the equations must be linear combinations of the other ones, so that these 
(m — r) redundant equations can be deleted without affecting the solution(s). It then follows 
from the preceding results that this system of equations (with or without the redundant equations) 
possesses at least one solution, where the number of solutions is one if r = n or infinite if 
r<n. 
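The rank conditions just described translate into a short classification routine. This is a minimal sketch under stated assumptions: A is given as a list of rows and b as a list of right-hand sides, ranks are computed by row reduction with exact fractions, and the names `matrix_rank` and `classify_system` are hypothetical.

```python
from fractions import Fraction


def matrix_rank(matrix):
    """Rank of a matrix (list of rows), found by row reduction."""
    rows = [[Fraction(value) for value in row] for row in matrix]
    found = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * p for a, p in zip(rows[r], rows[found])]
        found += 1
    return found


def classify_system(A, b):
    """Compare rank(A) with rank([A, b]) and with n, the number of variables."""
    augmented = [row + [rhs] for row, rhs in zip(A, b)]
    rank_A, rank_augmented = matrix_rank(A), matrix_rank(augmented)
    if rank_A < rank_augmented:
        return "no solution"
    if rank_A == len(A[0]):
        return "exactly one solution"
    return "infinite number of solutions"


print(classify_system([[1, 1], [1, 1]], [1, 2]))           # no solution
print(classify_system([[1, 1], [1, -1]], [2, 0]))          # exactly one solution
print(classify_system([[1, 1], [2, 2]], [1, 2]))           # infinite number of solutions
print(classify_system([[1, 0], [0, 1], [1, 1]], [1, 2, 3]))  # exactly one solution
```

The last call has more equations than variables (m > n) but one of them is redundant, so a unique solution still exists.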

Now consider how to find a solution to the system of equations. Assume for the moment 
that m = n and A is nonsingular, so that a unique solution exists. This solution can be obtained
by the Gauss-Jordan method of elimination (commonly called Gaussian elimination), which 
proceeds as follows. To begin, eliminate the first variable from all but one (say, the first) of 
the equations by adding an appropriate multiple (positive or negative) of this equation to each 
of the others. (For convenience, this one equation would be divided by the coefficient of this 
variable, so that the final value of this coefficient is 1.) Next, proceed in the same way to 
eliminate the second variable from all equations except one new one (say, the second). Then 
repeat this procedure for the third variable, the fourth variable, and so on, until each of the n 
variables remains in only one of the equations and each of the n equations contains exactly one 
of these variables. The desired solution can then be read from the equations directly. 

To illustrate the Gauss-Jordan method of elimination, we consider the following system 
of linear equations: 


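\[
\begin{aligned}
\text{(1)} \quad & x_1 - x_2 + 4x_3 = 10 \\
\text{(2)} \quad & -x_1 + 3x_2 = 10 \\
\text{(3)} \quad & 2x_2 + 5x_3 = 22.
\end{aligned}
\]

The method begins by eliminating $x_1$ from all but the first equation. This first step is executed
simply by adding Eq. (1) to Eq. (2), which yields

\[
\begin{aligned}
\text{(1)} \quad & x_1 - x_2 + 4x_3 = 10 \\
\text{(2)} \quad & 2x_2 + 4x_3 = 20 \\
\text{(3)} \quad & 2x_2 + 5x_3 = 22.
\end{aligned}
\]

The next step is to eliminate $x_2$ from all but the second equation. Begin this step by dividing
Eq. (2) by 2, so that $x_2$ will have a coefficient of +1, as follows:

\[
\begin{aligned}
\text{(1)} \quad & x_1 - x_2 + 4x_3 = 10 \\
\text{(2)} \quad & x_2 + 2x_3 = 10 \\
\text{(3)} \quad & 2x_2 + 5x_3 = 22.
\end{aligned}
\]

Then add Eq. (2) to Eq. (1), and subtract two times Eq. (2) from Eq. (3), which yields

\[
\begin{aligned}
\text{(1)} \quad & x_1 + 6x_3 = 20 \\
\text{(2)} \quad & x_2 + 2x_3 = 10 \\
\text{(3)} \quad & x_3 = 2.
\end{aligned}
\]

The final step is to eliminate $x_3$ from all but the third equation. This step requires subtracting
six times Eq. (3) from Eq. (1) and subtracting two times Eq. (3) from Eq. (2), which yields

\[
\begin{aligned}
\text{(1)} \quad & x_1 = 8 \\
\text{(2)} \quad & x_2 = 6 \\
\text{(3)} \quad & x_3 = 2.
\end{aligned}
\]

Thus the desired solution is $(x_1, x_2, x_3) = (8, 6, 2)$, and the procedure is completed.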
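The elimination steps above can be sketched as a short routine for the case m = n with A nonsingular. This is an illustration under stated assumptions, not the book's own program: exact fractions are used, the function name `gauss_jordan` is hypothetical, and a row interchange is added so that a zero coefficient in the pivot position does not stop the procedure (a detail the description above does not spell out).

```python
from fractions import Fraction


def gauss_jordan(A, b):
    """Solve Ax = b for a square nonsingular A by Gauss-Jordan elimination."""
    n = len(A)
    # Work on the augmented rows [a_i1, ..., a_in, b_i].
    rows = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Choose an equation that still contains this variable.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("A is singular")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Divide so that the coefficient of this variable becomes 1.
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        # Eliminate the variable from every other equation.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[n] for row in rows]


A = [[1, -1, 4], [-1, 3, 0], [0, 2, 5]]
b = [10, 10, 22]
print(gauss_jordan(A, b))    # values equal to 8, 6, 2
```

Applied to the example, the routine passes through the same intermediate systems as the hand calculation and returns the solution (8, 6, 2).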

Now consider briefly what happens if the Gauss-Jordan method of elimination is applied 
when $m \ne n$ and/or A is singular. As we discussed earlier, there are three possible cases to
consider. First, if the rank of [A, b] exceeds the rank of A by 1, then no solution to the system 
of equations will exist. In this case, the Gauss-Jordan method obtains an equation where the 
left-hand side has vanished (i.e., all the coefficients of the variables are zero), whereas the 
right-hand side is nonzero. This signpost indicates that no solution exists, so there is no reason 
to proceed further. 

The second case is where both of these ranks are equal to n, so that a unique solution 
exists. This case implies that $m \ge n$. If m = n, then the previous assumptions must hold and
no difficulty arises. Therefore, suppose that m > n, so that there are (m — n) redundant 
equations. In this case, all these redundant equations are eliminated (i.e., both the left-hand 
and right-hand sides would become zero) during the process of executing the Gauss-Jordan 
method, so the unique solution is identified just as it was before. 

The final case is where both the ranks are equal to r, where r < n, so that the system 
of equations possesses an infinite number of solutions. In this case, at the completion of the 
Gauss-Jordan method, each of the r variables remains in only one of the equations, and each 
of the r equations (any additional equations have vanished) contains exactly one of these 
variables. However, each of the other (n — r) variables either vanishes or remains in some of 
the equations. Therefore, any solution obtained by assigning arbitrary values to the (n — r) 
variables, and then identifying the respective values of the r variables from the single final 
equation in which each one appears, is a solution to the system of simultaneous equations. 
Equivalently, the transfer of these (n — r) variables to the right-hand side of the equations 
(either before or after the method is executed) identifies the solutions for the r variables as a 
function of these extra variables. 







Appendix 


Table A5.1 Areas Under the Normal Curve from K, to œ 


Tables P{normal = K,} = k = ex = a 


K_α | .00 | .01 | .02 | .03 | .04 | .05 | .06 | .07 | .08 | .09
0.0 | .5000 | .4960 | .4920 | .4880 | .4840 | .4801 | .4761 | .4721 | .4681 | .4641
0.1 | .4602 | .4562 | .4522 | .4483 | .4443 | .4404 | .4364 | .4325 | .4286 | .4247
0.2 | .4207 | .4168 | .4129 | .4090 | .4052 | .4013 | .3974 | .3936 | .3897 | .3859
0.3 | .3821 | .3783 | .3745 | .3707 | .3669 | .3632 | .3594 | .3557 | .3520 | .3483
0.4 | .3446 | .3409 | .3372 | .3336 | .3300 | .3264 | .3228 | .3192 | .3156 | .3121
0.5 | .3085 | .3050 | .3015 | .2981 | .2946 | .2912 | .2877 | .2843 | .2810 | .2776
0.6 | .2743 | .2709 | .2676 | .2643 | .2611 | .2578 | .2546 | .2514 | .2483 | .2451
0.7 | .2420 | .2389 | .2358 | .2327 | .2296 | .2266 | .2236 | .2206 | .2177 | .2148
0.8 | .2119 | .2090 | .2061 | .2033 | .2005 | .1977 | .1949 | .1922 | .1894 | .1867
0.9 | .1841 | .1814 | .1788 | .1762 | .1736 | .1711 | .1685 | .1660 | .1635 | .1611
1.0 | .1587 | .1562 | .1539 | .1515 | .1492 | .1469 | .1446 | .1423 | .1401 | .1379
1.1 | .1357 | .1335 | .1314 | .1292 | .1271 | .1251 | .1230 | .1210 | .1190 | .1170
1.2 | .1151 | .1131 | .1112 | .1093 | .1075 | .1056 | .1038 | .1020 | .1003 | .0985
1.3 | .0968 | .0951 | .0934 | .0918 | .0901 | .0885 | .0869 | .0853 | .0838 | .0823
1.4 | .0808 | .0793 | .0778 | .0764 | .0749 | .0735 | .0721 | .0708 | .0694 | .0681
1.5 | .0668 | .0655 | .0643 | .0630 | .0618 | .0606 | .0594 | .0582 | .0571 | .0559
1.6 | .0548 | .0537 | .0526 | .0516 | .0505 | .0495 | .0485 | .0475 | .0465 | .0455
1.7 | .0446 | .0436 | .0427 | .0418 | .0409 | .0401 | .0392 | .0384 | .0375 | .0367
1.8 | .0359 | .0351 | .0344 | .0336 | .0329 | .0322 | .0314 | .0307 | .0301 | .0294
1.9 | .0287 | .0281 | .0274 | .0268 | .0262 | .0256 | .0250 | .0244 | .0239 | .0233
2.0 | .0228 | .0222 | .0217 | .0212 | .0207 | .0202 | .0197 | .0192 | .0188 | .0183
2.1 | .0179 | .0174 | .0170 | .0166 | .0162 | .0158 | .0154 | .0150 | .0146 | .0143
2.2 | .0139 | .0136 | .0132 | .0129 | .0125 | .0122 | .0119 | .0116 | .0113 | .0110
2.3 | .0107 | .0104 | .0102 | .00990 | .00964 | .00939 | .00914 | .00889 | .00866 | .00842
2.4 | .00820 | .00798 | .00776 | .00755 | .00734 | .00714 | .00695 | .00676 | .00657 | .00639
2.5 | .00621 | .00604 | .00587 | .00570 | .00554 | .00539 | .00523 | .00508 | .00494 | .00480
2.6 | .00466 | .00453 | .00440 | .00427 | .00415 | .00402 | .00391 | .00379 | .00368 | .00357
2.7 | .00347 | .00336 | .00326 | .00317 | .00307 | .00298 | .00289 | .00280 | .00272 | .00264
2.8 | .00256 | .00248 | .00240 | .00233 | .00226 | .00219 | .00212 | .00205 | .00199 | .00193
2.9 | .00187 | .00181 | .00175 | .00169 | .00164 | .00159 | .00154 | .00149 | .00144 | .00139


K_α | .0 | .1 | .2 | .3 | .4 | .5 | .6 | .7 | .8 | .9
3 | .00135 | .0³968 | .0³687 | .0³483 | .0³337 | .0³233 | .0³159 | .0³108 | .0⁴723 | .0⁴481
4 | .0⁴317 | .0⁴207 | .0⁴133 | .0⁵854 | .0⁵541 | .0⁵340 | .0⁵211 | .0⁵130 | .0⁶793 | .0⁶479
5 | .0⁶287 | .0⁶170 | .0⁷996 | .0⁷579 | .0⁷333 | .0⁷190 | .0⁷107 | .0⁸599 | .0⁸332 | .0⁸182
6 | .0⁹987 | .0⁹530 | .0⁹282 | .0⁹149 | .0¹⁰777 | .0¹⁰402 | .0¹⁰206 | .0¹⁰104 | .0¹¹523 | .0¹¹260











































Source: Croxton, F. E.: Tables of Areas in Two Tails and in One Tail of the Normal Curve. Copyright 
1949 by Prentice-Hall, Inc., Englewood Cliffs, N.J. 
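
For values of K_α that fall between the tabled ones, the tail area can be computed directly. The following is a minimal sketch, assuming only Python's standard `math` module; the function name `upper_tail_area` is a hypothetical label, not something defined in the text. It relies on the identity that the standard normal upper-tail area equals half the complementary error function evaluated at K_α/√2.

```python
import math

def upper_tail_area(k_alpha: float) -> float:
    """Area under the standard normal curve from k_alpha to infinity."""
    return 0.5 * math.erfc(k_alpha / math.sqrt(2.0))

# Spot checks against tabled entries (rounded as in Table A5.1).
area_at_1_96 = round(upper_tail_area(1.96), 4)
area_at_3_0 = round(upper_tail_area(3.0), 5)
```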




Table A5.2 100α Percentage Points of Student's t Distribution
P{Student's t with ν degrees of freedom ≥ tabled value} = α


ν | 0.40 | 0.25 | 0.10 | 0.05 | 0.025 | 0.01 | 0.005 | 0.0025 | 0.001 | 0.0005
1 | 0.325 | 1.000 | 3.078 | 6.314 | 12.706 | 31.821 | 63.657 | | |
2 | 0.289 | 0.816 | 1.886 | 2.920 | 4.303 | 6.965 | 9.925 | | |
3 | 0.277 | 0.765 | 1.638 | 2.353 | 3.182 | 4.541 | 5.841 | 7.453 | | 12.924
4 | 0.271 | 0.741 | 1.533 | 2.132 | 2.776 | 3.747 | 4.604 | 5.598 | 7.173 | 8.610
5 | 0.267 | 0.727 | 1.476 | 2.015 | 2.571 | 3.365 | 4.032 | 4.773 | 5.893 | 6.869
6 | 0.265 | 0.718 | 1.440 | 1.943 | 2.447 | 3.143 | 3.707 | 4.317 | 5.208 | 5.959
7 | 0.263 | 0.711 | 1.415 | 1.895 | 2.365 | 2.998 | 3.499 | 4.029 | 4.785 | 5.408
8 | 0.262 | 0.706 | 1.397 | 1.860 | 2.306 | 2.896 | 3.355 | 3.833 | 4.501 | 5.041
9 | 0.261 | 0.703 | 1.383 | 1.833 | 2.262 | 2.821 | 3.250 | 3.690 | 4.297 | 4.781
10 | 0.260 | 0.700 | 1.372 | 1.812 | 2.228 | 2.764 | 3.169 | 3.581 | 4.144 | 4.587
11 | 0.260 | 0.697 | 1.363 | 1.796 | 2.201 | 2.718 | 3.106 | 3.497 | 4.025 | 4.437
12 | 0.259 | 0.695 | 1.356 | 1.782 | 2.179 | 2.681 | 3.055 | 3.428 | 3.930 | 4.318
13 | 0.259 | 0.694 | 1.350 | 1.771 | 2.160 | 2.650 | 3.012 | 3.372 | 3.852 | 4.221
14 | 0.258 | 0.692 | 1.345 | 1.761 | 2.145 | 2.624 | 2.977 | 3.326 | 3.787 | 4.140
15 | 0.258 | 0.691 | 1.341 | 1.753 | 2.131 | 2.602 | 2.947 | 3.286 | 3.733 | 4.073
16 | 0.258 | 0.690 | 1.337 | 1.746 | 2.120 | 2.583 | 2.921 | 3.252 | 3.686 | 4.015
17 | 0.257 | 0.689 | 1.333 | 1.740 | 2.110 | 2.567 | 2.898 | 3.222 | 3.646 | 3.965
18 | 0.257 | 0.688 | 1.330 | 1.734 | 2.101 | 2.552 | 2.878 | 3.197 | 3.610 | 3.922
19 | | | | | 2.093 | 2.539 | 2.861 | 3.174 | 3.579 | 3.883
20 | | | | | 2.086 | 2.528 | 2.845 | 3.153 | 3.552 | 3.850
21 | | | | | 2.080 | 2.518 | 2.831 | 3.135 | 3.527 | 3.819
22 | | | | | 2.074 | 2.508 | 2.819 | 3.119 | 3.505 | 3.792
23 | | | | | 2.069 | 2.500 | 2.807 | 3.104 | 3.485 | 3.767
24 | | | | | 2.064 | 2.492 | 2.797 | 3.091 | 3.467 | 3.745
25 | | | | | 2.060 | 2.485 | 2.787 | 3.078 | 3.450 | 3.725
26 | | | | | 2.056 | 2.479 | 2.779 | 3.067 | 3.435 | 3.707
27 | | | | | 2.052 | 2.473 | 2.771 | 3.057 | 3.421 | 3.690
28 | | | | | 2.048 | 2.467 | 2.763 | 3.047 | 3.408 | 3.674
29 | | | | | 2.045 | 2.462 | 2.756 | 3.038 | 3.396 | 3.659
30 | | | | | 2.042 | 2.457 | 2.750 | 3.030 | 3.385 | 3.646
40 | | | | | 2.021 | 2.423 | 2.704 | 2.971 | 3.307 | 3.551
60 | | | | | 2.000 | 2.390 | 2.660 | 2.915 | 3.232 | 3.460
120 | | | | | 1.980 | 2.358 | 2.617 | 2.860 | |
∞ | | | | | | 2.326 | 2.576 | | |


Source: Table 12 of Biometrika Tables for Statisticians, vol. I, 3d ed., 1966, by permission of the
Biometrika Trustees.


Table A5.3 100α Percentage Points of Chi-Square Distribution
P{Chi square with ν degrees of freedom ≥ tabled value} = α









































































ν \ α | 0.995 | 0.99 | 0.975 | 0.95 | 0.50
1 | 0.0⁴393 | 0.0³157 | 0.0³982 | 0.00393 |
2 | 0.0100 | 0.0201 | 0.0506 | 0.103 |
3 | 0.0717 | 0.115 | 0.216 | 0.352 | 2.366
4 | 0.207 | 0.297 | 0.484 | 0.711 | 3.357
5 | 0.412 | 0.554 | 0.831 | 1.145 | 4.351
6 | 0.676 | 0.872 | 1.237 | 1.635 | 5.348
7 | 0.989 | 1.239 | 1.690 | 2.167 | 6.346
8 | 1.344 | 1.646 | 2.180 | 2.733 | 7.344
9 | 1.735 | 2.088 | 2.700 | 3.325 | 8.343
10 | 2.156 | 2.558 | 3.247 | 3.940 | 9.342
11 | 2.603 | 3.053 | 3.816 | 4.575 | 10.341
12 | 3.074 | 3.571 | 4.404 | 5.226 | 11.340
13 | 3.565 | 4.107 | 5.009 | 5.892 | 12.340
14 | 4.075 | 4.660 | 5.629 | 6.571 | 13.339
15 | 4.601 | 5.229 | 6.262 | 7.261 | 14.339
16 | 5.142 | 5.812 | 6.908 | 7.962 | 15.338
17 | 5.697 | 6.408 | 7.564 | 8.672 | 16.338
18 | 6.265 | 7.015 | 8.231 | 9.390 | 17.338
19 | 6.844 | 7.633 | 8.907 | 10.117 | 18.338
20 | 7.434 | 8.260 | 9.591 | 10.851 | 19.337
21 | 8.034 | 8.897 | 10.283 | 11.591 | 20.337
22 | 8.643 | 9.542 | 10.982 | 12.338 | 21.337
23 | 9.260 | 10.196 | 11.688 | 13.091 | 22.337
24 | 9.886 | 10.856 | 12.401 | 13.848 | 23.337
25 | 10.520 | 11.524 | 13.120 | 14.611 | 24.337
26 | 11.160 | 12.198 | 13.844 | 15.379 | 25.336
27 | 11.808 | 12.879 | 14.573 | 16.151 | 26.336
28 | 12.461 | 13.565 | 15.308 | 16.928 | 27.336
29 | 13.121 | 14.256 | 16.047 | 17.708 | 28.336
30 | 13.787 | 14.953 | 16.791 | 18.493 | 29.336
40 | 20.707 | 22.164 | 24.433 | 26.509 | 39.335
50 | 27.991 | 29.707 | 32.357 | 34.764 | 49.335
60 | 35.535 | 37.485 | 40.482 | 43.188 | 59.335
70 | 43.275 | 45.442 | 48.758 | 51.739 | 69.334
80 | 51.172 | 53.540 | 57.153 | 60.391 | 79.334
90 | 59.196 | 61.754 | 65.647 | 69.126 | 89.334
100 | 67.328 | 70.065 | 74.222 | 77.929 | 99.334
K_α | −2.576 | −2.326 | −1.960 | −1.645 |

















Source: Abridged from Table 8 of Biometrika Tables for Statisticians, vol. 1, 3d ed., 1966, by
permission of the Biometrika Trustees.


Table A5.3 (continued) 

















ν \ α | 0.25 | 0.10 | 0.05 | 0.025 | 0.01 | 0.005 | 0.001
1 | 1.323 | 2.706 | 3.841 | 5.024 | 6.635 | 7.879 | 10.828
2 | 2.773 | 4.605 | 5.991 | 7.378 | 9.210 | 10.597 | 13.816
3 | 4.108 | 6.251 | 7.815 | 9.348 | 11.345 | 12.838 | 16.266
4 | 5.385 | 7.779 | 9.488 | 11.143 | 13.277 | 14.860 | 18.467
5 | 6.626 | 9.236 | 11.070 | 12.832 | 15.086 | 16.750 | 20.515
6 | 7.841 | 10.645 | 12.592 | 14.449 | 16.812 | 18.548 | 22.458
7 | 9.037 | 12.017 | 14.067 | 16.013 | 18.475 | 20.278 | 24.322
8 | 10.219 | 13.362 | 15.507 | 17.535 | 20.090 | 21.955 | 26.125
9 | 11.389 | 14.684 | 16.919 | 19.023 | 21.666 | 23.589 | 27.877
10 | 12.549 | 15.987 | 18.307 | 20.483 | 23.209 | 25.188 | 29.588
11 | 13.701 | 17.275 | 19.675 | 21.920 | 24.725 | 26.757 | 31.264
12 | 14.845 | 18.549 | 21.026 | 23.337 | 26.217 | 28.300 | 32.909
13 | 15.984 | 19.812 | 22.362 | 24.736 | 27.688 | 29.819 | 34.528
14 | 17.117 | 21.064 | 23.685 | 26.119 | 29.141 | 31.319 | 36.123
15 | 18.245 | 22.307 | 24.996 | 27.488 | 30.578 | 32.801 | 37.697
16 | 19.369 | 23.542 | 26.296 | 28.845 | 32.000 | 34.267 | 39.252
17 | 20.489 | 24.769 | 27.587 | 30.191 | 33.409 | 35.718 | 40.790
18 | 21.605 | 25.989 | 28.869 | 31.526 | 34.805 | 37.156 | 42.312
19 | 22.718 | 27.204 | 30.144 | 32.852 | 36.191 | 38.582 | 43.820
20 | 23.828 | 28.412 | 31.410 | 34.170 | 37.566 | 39.997 | 45.315
21 | 24.935 | 29.615 | 32.671 | 35.479 | 38.932 | 41.401 | 46.797
22 | 26.039 | 30.813 | 33.924 | 36.781 | 40.289 | 42.796 | 48.268
23 | 27.141 | 32.007 | 35.172 | 38.076 | 41.638 | 44.181 | 49.728
24 | 28.241 | 33.196 | 36.415 | 39.364 | 42.980 | 45.558 | 51.179
25 | 29.339 | 34.382 | 37.652 | 40.646 | 44.314 | 46.928 | 52.620
26 | 30.434 | 35.563 | 38.885 | 41.923 | 45.642 | 48.290 | 54.052
27 | 31.528 | 36.741 | 40.113 | 43.194 | 46.963 | 49.645 | 55.476
28 | 32.620 | 37.916 | 41.337 | 44.461 | 48.278 | 50.993 | 56.892
29 | 33.711 | 39.087 | 42.557 | 45.722 | 49.588 | 52.336 | 58.302
30 | 34.800 | 40.256 | 43.773 | 46.979 | 50.892 | 53.672 | 59.703
40 | 45.616 | 51.805 | 55.758 | 59.342 | 63.691 | 66.766 | 73.402
50 | 56.334 | 63.167 | 67.505 | 71.420 | 76.154 | 79.490 | 86.661
60 | 66.981 | 74.397 | 79.082 | 83.298 | 88.379 | 91.952 | 99.607
70 | 77.577 | 85.527 | 90.531 | 95.023 | 100.425 | 104.215 | 112.317
80 | 88.130 | 96.578 | 101.879 | 106.629 | 112.329 | 116.321 | 124.839
90 | 98.650 | 107.565 | 113.145 | 118.136 | 124.116 | 128.299 | 137.208
100 | 109.141 | 118.498 | 124.342 | 129.561 | 135.807 | 140.169 | 149.449
K_α | +0.6745 | +1.282 | +1.645 | +1.960 | +2.326 | +2.576 | +3.090

For ν > 100 take

$\chi^2 = \nu\left[1 - \frac{2}{9\nu} + K_\alpha\sqrt{\frac{2}{9\nu}}\right]^3$ or $\chi^2 = \frac{1}{2}\left[K_\alpha + \sqrt{2\nu - 1}\right]^2$,

according to the degree of accuracy required. K_α is the standardized normal deviate corresponding to α
and is shown in the bottom line of the table.
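
The two large-ν approximations quoted in the note to Table A5.3 are easy to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, and ν = 100 with K_α = 1.645 is chosen simply because the tabled value (124.342) is available for comparison.

```python
import math

def chi_square_cube_approx(nu: float, k_alpha: float) -> float:
    """nu * [1 - 2/(9 nu) + K_alpha * sqrt(2/(9 nu))]**3."""
    return nu * (1 - 2 / (9 * nu) + k_alpha * math.sqrt(2 / (9 * nu))) ** 3

def chi_square_square_approx(nu: float, k_alpha: float) -> float:
    """0.5 * [K_alpha + sqrt(2 nu - 1)]**2."""
    return 0.5 * (k_alpha + math.sqrt(2 * nu - 1)) ** 2

cube_value = chi_square_cube_approx(100, 1.645)
square_value = chi_square_square_approx(100, 1.645)
```

At ν = 100 the cube form agrees with the tabled entry far more closely than the square form, which is consistent with the note's remark that the choice depends on the accuracy required.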
Table A5.4 Summation of Terms of the Poisson Distribution: 1,000 P{Poisson with parameter λ ≤ c}
































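c \ λ | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09
0 | 990 | 980 | 970 | 961 | 951 | 942 | 932 | 923 | 914
1 | 1,000 | 1,000 | 1,000 | 999 | 999 | 998 | 998 | 997 | 996
2 | | | | 1,000 | 1,000 | 1,000 | 1,000 | 1,000 | 1,000

c \ λ | 0.10 | 0.15 | 0.20 | 0.25 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
0 | 905 | 861 | 819 | 779 | 741 | 705 | 670 | 638 | 607
1 | 995 | 990 | 982 | 974 | 963 | 951 | 938 | 925 | 910
2 | 1,000 | 999 | 999 | 998 | 996 | 994 | 992 | 989 | 986
3 | | 1,000 | 1,000 | 1,000 | 1,000 | 1,000 | 999 | 999 | 998
4 | | | | | | | 1,000 | 1,000 | 1,000

c \ λ | 0.55 | 0.60 | 0.65 | 0.70 | 0.75 | 0.80 | 0.85 | 0.90 | 0.95 | 1.00
0 | 577 | 549 | 522 | 497 | 472 | 449 | 427 | 407 | 387 | 368
1 | 894 | 878 | 861 | 844 | 827 | 809 | 791 | 772 | 754 | 736
2 | 982 | 977 | 972 | 966 | 959 | 953 | 945 | 937 | 929 | 920
3 | 998 | 997 | 996 | 994 | 993 | 991 | 989 | 987 | 984 | 981
4 | 1,000 | 1,000 | 999 | 999 | 999 | 999 | 998 | 998 | 997 | 996
5 | | | 1,000 | 1,000 | 1,000 | 1,000 | 1,000 | 1,000 | 1,000 | 999
6 | | | | | | | | | | 1,000

c \ λ | 1.05 | 1.10 | 1.15 | 1.20 | 1.25 | 1.30 | 1.35 | 1.40 | 1.45 | 1.50
0 | 350 | 333 | 317 | 301 | 287 | 273 | 259 | 247 | 235 | 223
1 | 717 | 699 | 681 | 663 | 645 | 627 | 609 | 592 | 575 | 558
2 | 910 | 900 | 890 | 879 | 868 | 857 | 845 | 833 | 821 | 809
3 | 978 | 974 | 970 | 966 | 962 | 957 | 952 | 946 | 940 | 934
4 | 996 | 995 | 993 | 992 | 991 | 989 | 988 | 986 | 984 | 981
5 | 999 | 999 | 999 | 998 | 998 | 998 | 997 | 997 | 996 | 996
6 | 1,000 | 1,000 | 1,000 | 1,000 | 1,000 | 1,000 | 999 | 999 | 999 | 999
7 | | | | | | | 1,000 | 1,000 | 1,000 | 1,000

c \ λ | 1.55 | 1.60 | 1.65 | 1.70 | 1.75 | 1.80 | 1.85 | 1.90 | 1.95 | 2.00
0 | 212 | 202 | 192 | 183 | 174 | 165 | 157 | 150 | 142 | 135
1 | 541 | 525 | 509 | 493 | 478 | 463 | 448 | 434 | 420 | 406
2 | 796 | 783 | 770 | 757 | 744 | 731 | 717 | 704 | 690 | 677
3 | 928 | 921 | 914 | 907 | 899 | 891 | 883 | 875 | 866 | 857
4 | 979 | 976 | 973 | 970 | 967 | 964 | 960 | 956 | 952 | 947
5 | 995 | 994 | 993 | 992 | 991 | 990 | 988 | 987 | 985 | 983
6 | 999 | 999 | 998 | 998 | 998 | 997 | 997 | 997 | 996 | 995
7 | 1,000 | 1,000 | 1,000 | 1,000 | 1,000 | 999 | 999 | 999 | 999 | 999
8 | | | | | | 1,000 | 1,000 | 1,000 | 1,000 | 1,000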

















Source: Reproduced by permission from Hillier, F. S., and F. D. Lo: Tables for Multiple-Server 
Queueing Systems Involving Erlang Distributions, Technical Report #14, NSF GK-2925, Department of 
Operations Research, Stanford University, Stanford, Calif., December 28, 1971. 
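
Entries of Table A5.4 can be regenerated by summing Poisson terms directly. The sketch below is a minimal illustration, assuming only Python's standard library; `poisson_cumulative` and `table_entry` are hypothetical names, and each term is built from the previous one to avoid computing large factorials.

```python
import math

def poisson_cumulative(c: int, lam: float) -> float:
    """P{Poisson with parameter lam <= c}, summing terms k = 0, 1, ..., c."""
    term = math.exp(-lam)      # the k = 0 term
    total = term
    for k in range(1, c + 1):
        term *= lam / k        # term_k = term_(k-1) * lam / k
        total += term
    return total

def table_entry(c: int, lam: float) -> int:
    """Value in the table's units: 1,000 times the cumulative probability."""
    return round(1000 * poisson_cumulative(c, lam))

entry_c2_lam1 = table_entry(2, 1.0)
```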


Table A5.4 (continued) 





pe | OU amArANANMNAWNH SO 















Re 





122 
380 
650 
839 
938 
980 
994 
999 
1,000 





2.20 


111 
355 
623 
819 
928 
975 
993 
998 
1,000 


2.30 


100 
331 
596 
799 
916 
970 
991 
997 
999 
1,000 


2.40 


091 
308 


570 


779 
904 
964 
988 
997 
999 
1,000 


2.50 


082 
287 
544 
758 
891 
958 
986 
996 
999 
1,000 


2.60 


074 
267 
518 
736 
877 
951 
983 
995 
999 
1,000 


2.70 


067 
249 
494 
714 
863 
943 
979 
993 
998 
999 
1,000 


2.80 


061 
231 
469 
692 
848 
935 
976 
992 
998 
999 
1,000 


2.90 


055 
215 
446 
670 
832 
926 
971 
990 
997 
999 
1,000 


3.00 


050 
199 
423 
647 
815 
916 
966 
988 
996 
999 
1,000 








































3.10 3.20 3.30 3.40 3.50 

0 045 041 037 033 030 
1 185 171 159 147 136 
2 401 380 359 340 321 
3 625 603 580 558 537 
4 798 781 763 744 725 
5 906 895 883 871 858 
6 961 955 949 942 935 
7 986 983 980 977 973 
8 995 994 993 992 990 
9 999 998 998 997 997 
1,000 1,000 999 999 999 

1,000 1,000 1,000 

4.10 4.20 4.30 4.40 4.50 

017 015 014 012 011 

085 078 072 066 061 

224 210 197 185 174 

414 395 377 359 342 

609 590 570 551 532 

769 753 737 720 703 

879 867 856 844 831 

943 936 929 921 913 

976 972 968 964 960 

990 989 987 985 983 

997 996 995 994 993 

999 999 998 998 998 

1,000 1,000 999 999 999 

1,000 1,000 1,000 


3.60 


3.70 


3.80 


3.90 








027 
126 
303 
515 
706 
844 
927 
969 
988 
996 
999 
1,000 


4.60 


010 
056 
163 
326 
513 
686 
818 
905 
955 
980 
992 
997 
999 
1,000 


025 
116 
285 
494 
687 
830 
918 
965 
986 
995 
998 
1,000 





022 
107 
269 
473 
668 
816 
909 
960 
984 
994 
998 
999 
1,000 





020 
099 
253 
453 
648 
801 
899 
955 
981 
993 
998 
999 
1,000 









4.00 


018 
092 
238 
433 
629 
785 
889 
949 
979 
992 
997 
999 
1,000 














c \ λ | 4.70 | 4.80 | 4.90 | 5.00
0 | 009 | 008 | 007 | 007
1 | 052 | 048 | 044 | 040
2 | 152 | 143 | 133 | 125
3 | 310 | 294 | 279 | 265
4 | 495 | 476 | 458 | 440
5 | 668 | 651 | 634 | 616
6 | 805 | 791 | 777 | 762
7 | 896 | 887 | 877 | 867
8 | 950 | 944 | 938 | 932
9 | 978 | 975 | 972 | 968
10 | 991 | 990 | 988 | 986
11 | 997 | 996 | 995 | 995
12 | 999 | 999 | 998 | 998
13 | 1,000 | 1,000 | 999 | 999
14 | | | 1,000 | 1,000

Table A5.4 (continued)





034 
109 
238 
406 
581 
732 
845 
918 
960 
982 
993 
997 
999 
1,000 


031 
102 
225 
390 
563 
717 
833 
911 
956 
980 
992 
997 
999 
1,000 


029 
095 
213 
373 
546 
702 
822 
903 
951 
977 
990 
996 
999 
1,000 


027 
088 
202 
358 
529 
686 
809
894 
946 
975 
989 


996


998 
999 
1,000 


024 
082 
191 
342 
512 
670 
797 
886 
941 
972 
988 
995 
998 
999 
1,000 


003 


022 021 
077 072 
180 170 
327 313 
495 478 
654 638 
784 771 
877 867 
935 929 
969 965 
986 984 
994 993 
998 997 
999 999 
1,000 1,000 


019 
067 
160 
299 
462 
622 
758 
857 
923 
961 
982 
992 
997 
999 
1,000 



















017 
062 
151 
285 
446 
606 
744 
847 
916 
957 
980 
991 
996 
999 
1,000 


















6.70 








Table A5.4 (continued) 





























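c \ λ | 7.10 | 7.20 | 7.30 | 7.40 | 7.50 | 8.00 | 8.50 | 9.00 | 9.50 | 10.00
0 | 001 | 001 | 001 | 001 | 001 | 000 | 000 | 000 | 000 | 000
1 | 007 | 006 | 006 | 005 | 005 | 003 | 002 | 001 | 001 | 000
2 | 027 | 025 | 024 | 022 | 020 | 014 | 009 | 006 | 004 | 003
3 | 077 | 072 | 067 | 063 | 059 | 042 | 030 | 021 | 015 | 010
4 | 164 | 156 | 147 | 140 | 132 | 100 | 074 | 055 | 040 | 029
5 | 288 | 276 | 264 | 253 | 241 | 191 | 150 | 116 | 089 | 067
6 | 435 | 420 | 406 | 392 | 378 | 313 | 256 | 207 | 165 | 130
7 | 584 | 569 | 554 | 539 | 525 | 453 | 386 | 324 | 269 | 220
8 | 716 | 703 | 689 | 676 | 662 | 593 | 523 | 456 | 392 | 333
9 | 820 | 810 | 799 | 788 | 776 | 717 | 653 | 587 | 522 | 458
10 | 894 | 887 | 879 | 871 | 862 | 816 | 763 | 706 | 645 | 583
11 | 942 | 937 | 932 | 926 | 921 | 888 | 849 | 803 | 752 | 697
12 | 970 | 967 | 964 | 961 | 957 | 936 | 909 | 876 | 836 | 792
13 | 986 | 984 | 982 | 980 | 978 | 966 | 949 | 926 | 898 | 864
14 | 994 | 993 | 992 | 991 | 990 | 983 | 973 | 959 | 940 | 917
15 | 997 | 997 | 996 | 996 | 995 | 992 | 986 | 978 | 967 | 951
16 | 999 | 999 | 999 | 998 | 998 | 996 | 993 | 989 | 982 | 973
17 | 1,000 | 1,000 | 999 | 999 | 999 | 998 | 997 | 995 | 991 | 986
18 | | | 1,000 | 1,000 | 1,000 | 999 | 999 | 998 | 996 | 993
19 | | | | | | 1,000 | 999 | 999 | 998 | 997
20 | | | | | | | 1,000 | 1,000 | 999 | 998
21 | | | | | | | | | 1,000 | 999
22 | | | | | | | | | | 1,000

c \ λ | 10.5 | 11.0 | 11.5 | 12.0 | 12.5 | 13.0 | 13.5 | 14.0 | 14.5 | 15.0
0 | 000 | 000 | 000 | 000 | 000 | 000 | 000 | 000 | 000 | 000
1 | 000 | 000 | 000 | 000 | 000 | 000 | 000 | 000 | 000 | 000
2 | 002 | 001 | 001 | 001 | 000 | 000 | 000 | 000 | 000 | 000
3 | 007 | 005 | 003 | 002 | 002 | 001 | 001 | 000 | 000 | 000
4 | 021 | 015 | 011 | 008 | 005 | 004 | 003 | 002 | 001 | 001
5 | 050 | 038 | 028 | 020 | 015 | 011 | 008 | 006 | 004 | 003
6 | 102 | 079 | 060 | 046 | 035 | 026 | 019 | 014 | 010 | 008
7 | 179 | 143 | 114 | 090 | 070 | 054 | 041 | 032 | 024 | 018
8 | 279 | 232 | 191 | 155 | 125 | 100 | 079 | 062 | 048 | 037
9 | 397 | 341 | 289 | 242 | 201 | 166 | 135 | 109 | 088 | 070
10 | 521 | 460 | 402 | 347 | 297 | 252 | 211 | 176 | 145 | 118
11 | 639 | 579 | 520 | 462 | 406 | 353 | 304 | 260 | 220 | 185
12 | 742 | 689 | 633 | 576 | 519 | 463 | 409 | 358 | 311 | 268
13 | 825 | 781 | 733 | 682 | 628 | 573 | 518 | 464 | 413 | 363
14 | 888 | 854 | 815 | 772 | 725 | 675 | 623 | 570 | 518 | 466
15 | 932 | 907 | 878 | 844 | 806 | 764 | 718 | 669 | 619 | 568
16 | 960 | 944 | 924 | 899 | 869 | 835 | 798 | 756 | 711 | 664
17 | 978 | 968 | 954 | 937 | 916 | 890 | 861 | 827 | 790 | 749
18 | 988 | 982 | 974 | 963 | 948 | 930 | 908 | 883 | 853 | 819
19 | 994 | 991 | 986 | 979 | 969 | 957 | 942 | 923 | 901 | 875
20 | 997 | 995 | 992 | 988 | 983 | 975 | 965 | 952 | 936 | 917
21 | 999 | 998 | 996 | 994 | 991 | 986 | 980 | 971 | 960 | 947
22 | 999 | 999 | 998 | 997 | 995 | 992 | 989 | 983 | 976 | 967
23 | 1,000 | 1,000 | 999 | 999 | 998 | 996 | 994 | 991 | 986 | 981
24 | | | 1,000 | 999 | 999 | 998 | 997 | 995 | 992 | 989
25 | | | | 1,000 | 999 | 999 | 998 | 997 | 996 | 994
26 | | | | | 1,000 | 1,000 | 999 | 999 | 998 | 997
27 | | | | | | | 1,000 | 999 | 999 | 998
28 | | | | | | | | 1,000 | 999 | 999
29 | | | | | | | | | 1,000 | 1,000

Table A5.4 (continued)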











16 17 18 19 20 21 22 23 24 25 
1 000 000 000 000 000 000 000 000 000 000 
2 000 000 000 000 000 000 000 000 000 000 
3 000 000 000 000 000 000 000 000 000 000 
4 000 000 000 000 000 000 000 000 000 000 
5 001 001 000 000 000 000 000 000 000 000 
6 004 002 001 001 000 000 000 000 000 000 
7 010 005 003 002 001 000 000 000 000 000 
8 022 013 007 004 002 001 001 000 000 000 








Answers to Selected 
Problems 


Chapter 3 
1. (b) Maximize Z = 4,500x_1 + 4,500x_2,


subject to x_1 ≤ 1
x_2 ≤ 1
5,000x_1 + 4,000x_2 ≤ 6,000
400x_1 + 500x_2 ≤ 600
and x_1 ≥ 0, x_2 ≥ 0.
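
A two-variable model like the one in answer 1(b) can be checked by enumerating corner points: intersect each pair of constraint boundaries, keep the feasible intersections, and compare objective values. The sketch below is only an illustration of that idea, not the book's solution procedure; the names `constraints`, `corner_points`, and `best` are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a1, a2, rhs), meaning a1*x1 + a2*x2 <= rhs.
# Nonnegativity is written as -x1 <= 0 and -x2 <= 0.
constraints = [
    (1, 0, 1),
    (0, 1, 1),
    (5000, 4000, 6000),
    (400, 500, 600),
    (-1, 0, 0),
    (0, -1, 0),
]
objective = (4500, 4500)

def corner_points(cons):
    """Feasible intersections of pairs of constraint boundary lines."""
    points = set()
    for (a1, b1, r1), (a2, b2, r2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundaries never intersect
        x = Fraction(r1 * b2 - r2 * b1, det)   # Cramer's rule
        y = Fraction(a1 * r2 - a2 * r1, det)
        if all(a * x + b * y <= r for a, b, r in cons):
            points.add((x, y))
    return points

def z(point):
    return objective[0] * point[0] + objective[1] * point[1]

best = max(corner_points(constraints), key=z)
best_z = z(best)
```

Enumeration is practical only for very small models; it is shown here because it mirrors the graphical reasoning used for two-variable problems.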


3. (x_1, x_2) = (13, 5); Z = 31.


Chapter 4 


4. (x_1, x_2, x_3) = (0, 10, 63); Z = 70.
16. (x_1, x_2) = (2, 1); Z = 7.




21. 
26. 


10. 
14. 
27. 


18. 


21. 


24. 





@ x) = ( F, 1B); Z = 2. 
(41, X2; x3) = G, 3, 0), with Z = 7. 


Chapter 5 


> 


(a) (x_1, x_2) = (2, 2) is optimal. Other corner-point feasible solutions are (0, 0), (3, 0)
and (0, 3).

Œ, X2, x3) = (0, 3, $) is optimal. 

(x_1, x_2, x_3, x_4, x_5) = (0, 5, 0, 3, 0) with Z = 50 is optimal.

(a) Right side is Z = 8, x, = 14, x, = 5, x, = II. 

(b) x, = 0, 2x, — 2x. + 3x3 = 5, x, +X) — x, = 3. 














Chapter 6 
(a) Minimize y_0 = 15y_1 + 12y_2 + 45y_3,
subject to −y_1 + y_2 + 5y_3 ≥ 10
2y_1 + y_2 + 3y_3 ≥ 20
and y_1 ≥ 0, y_2 ≥ 0, y_3 ≥ 0.
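
The dual in part (a) follows the usual pattern: the primal right-hand sides become dual objective coefficients, the primal objective coefficients become dual right-hand sides, and the constraint matrix is transposed. The sketch below illustrates that bookkeeping. The primal data shown are an assumption, obtained by transposing the dual printed above rather than quoted from the problem statement, and `dual_of` is a hypothetical helper name.

```python
def dual_of(c, A, b):
    """For: maximize c.x subject to A x <= b, x >= 0.

    Returns the data of: minimize b.y subject to A^T y >= c, y >= 0,
    as (objective coefficients, constraint rows, right-hand sides).
    """
    A_transposed = [list(column) for column in zip(*A)]
    return list(b), A_transposed, list(c)

# Assumed primal (inferred from the dual above, not taken from the text):
c = [10, 20]
A = [[-1, 2],
     [1, 1],
     [5, 3]]
b = [15, 12, 45]

dual_objective, dual_rows, dual_rhs = dual_of(c, A, b)
```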
(c) Complementary Basic Solutions 
Primal Problem Dual Problem 
Basic Solution Feasible? Z=» Feasible? Basic Solution 
(0, 0, 20, 10) Yes 0 No (0, 0, —6, —8) 
(4, 0, 0, 6) Yes 24 No (12, 0, 0, — 52) 
(0, 5, 10, 0) Yes 40 No (0, 4, —2, 0) 
(23, 3%, 0, 0) Yes and optimal 45 Yes and optimal | (3, 34, 0, 0) 
(10, 0, —30, 0) | No 60 Yes (0, 6, 0, 4) 
(0, 10,0, — 10) | No 80 Yes (4, 0, 14, 0) 











Maximize yọ = 8y, + 6y», 


subject to y, + 3y, S 2 
4y, + 2y, = 3 
2y; =i 
and y, 20, y, = 0. 
(a) Minimize yo = 30y, + 20y, + 25ys, 
subject to l ya — 3y; = —1 
3y, — Yot y= 2 
Jı 2 ae 1 
and y, = 0, y, = 0, y; 2 0. 
(d) Not optimal, since 2y, + 3y, = 3 is violated for yý = 4, y3 =2 
(f) Not optimal, since 3y, + 2y, = 2 is violated for yf = 4, y = 2. 


35. 


36. 


12. 


19. 


24. 
30. 


39. 


15. 













Part | New Basic Solution (x_1, x_2, x_3, x_4, x_5) | Feasible? | Optimal?
(a) | (0, 30, 0, 0, −30) | No | No
(b) | (0, 20, 0, 0, −10) | No | No
(c) | (0, 10, 0, 0, 60) | Yes | Yes
(d) | (0, 20, 0, 0, 10) | Yes | Yes
(e) | (0, 20, 0, 0, 10) | Yes | Yes
(f) | (0, 10, 0, 0, 40) | Yes | No
(g) | (0, 20, 0, 0, 10) | Yes | Yes
(h) | (0, 20, 0, 0, 10, x_6 = −10) | No | No
(i) | (0, 20, 0, 0, 0) | Yes | Yes











Chapter 7 


Let x, be the shipment from plant i to distribution center j. Then x,, = 2, x44 = 10, 

Xoo = 9, x23 = 8, x3, = 10, x3. = 1; cost = $20,200. 

(Answer in millions of acres) England → 70 oats; France → 110 wheat; Spain → 15
wheat, 60 barley, 5 oats.

(a) x1 = 3, Xyq = 2, Xn = 1, x33 = 1, x33 = 1, X34 = 2; four iterations to reach 
optimality. 

(b) and (c) xu = 3, xy = 0, x3 = 0, X14 = 2, X3 = 2, X32 = 3; already optimal. 

Xa = 10, x2 = 15, xy = 0, X93 = 5, X25 = 30, x33 = 20, x34 = 10, X44 = 10; 

cost = 77.30. Also have other tied optimal solutions. 

X14 = 20, xis = 50, x3 = 10, x4 = 10, x35 = 60, x37 = 60. 

Back → David, breast → Tony, butterfly → Chris, freestyle → Carl; time = 126.2.




















Master Problem | Subproblem 1 | Subproblem 2 
3x, + 2x, = 18 x, s4 2x, 12 
Chapter 8


(x_1, x_2) = (10, 5) is optimal.


Chapter 9 


(x_1, x_2, x_3) = (1, 3, 1) with Z = 8 is optimal.
@ Xo, x) = (3, 2, 0) with Z = 7# is optimal. 









New Optimal Solution 


(x_1, x_2, x_3, x_4, x_5) = (0, 0, 9, 3, 0)
(x_1, x_2, x_3, x_4, x_5) = (0, 5, 5, 0, 0)





Value of Z 


117 
90 





(a) 











(b) Range of θ | Optimal Solution | Z(θ)
0 ≤ θ ≤ 2 | (x_1, x_2) = (0, 5) | 120 − 10θ
2 ≤ θ ≤ 8 | (x_1, x_2) = (10/3, 10/3) | (320 − 10θ)/3
8 ≤ θ | (x_1, x_2) = (5, 0) | 40 + 5θ


as 


10. 


20. 


23. 


10. 


13. 
24. 


a 


11. 


Optimal Solution 





Range of θ | Z(θ)
0 ≤ θ ≤ 1 | 30 + 6θ
1 ≤ θ ≤ 5 | 35 + θ
5 ≤ θ ≤ 25 | 50 − 2θ
Chapter 10 


(a) O → A → B → D → T or O → A → B → E → D → T, with length = 16.
(a) {(O, A); (A, B); (B, C); (B, E); (E, D); (D, T)}, with length = 18.


0 3 2 
















Event 112]3] 4 6 8 9 | 10 
Earliest time 0 6 3 5 0 10 li 14 13 20 
Latest time 0 7 3 6 11 10 3 14 5 20 
Slack Oo11!1otldl 1 0 0 0 


Critical path: 1 → 3 → 6 → 8 → 10.


t, = 37,07 = 9. 
Chapter 11 
Store 
1: 2. 33 
Allocations ; : 7 








x_1 = −2 + √13 ≈ 1.6056, x_2 = 5 − √13 ≈ 1.3944; Z = 98.233.
Produce 2 on first production run; if none acceptable, produce 2 on second run. 
Expected cost = $573. 


Chapter 12 


(a) Player I: strategy 2; player II: strategy 1. 

(a) Politician I: issue 2; politician II: issue 2.

(b) Politician I: issue 1; politician II: issue 2. 

(c) Minimax criterion says politician I can use any issue, but issue 1 offers politician I
the only chance of winning if politician II is not "smart."

(a) Œp %) = G, 8); Or Yas y3) = G, 0, $) v = 5. 


14. 


10. 


18. 


20. 
22. 


16. 
25. 
29. 
33. 
36. 


50. 


58. 
61. 


Minimize —x4, 


subject to 5x, + 2x, + 3x3 -x,2= 0 
4x, + 2x; — x= 0 
3x, + 3x -x= 0 
x, + 2x, + 4x; — x,= 0 
Fe gS ==] 
and x, 20, xX, 20, x3; 20, x42 0. 
Chapter 13 


(b) (long, medium, short) = (14, 0, 16), with profit of $9,560,000.








(x_1, x_2, x_3, x_4) = (0, 1, 1, 0), with Z = 36.
(x_1, x_2, x_3, x_4, x_5) = (0, 0, 1, 1, 1), with Z = 6.


Chapter 14 


(a) Concave. 

Approximate solution = 1.0125. 

Exact solution is (x,, x) = (2, —2). 

(a) Approximate solution is (x,, x2) = (0.75, 1.875). 
(xi; x2) = (1, 2) cannot be optimal. 

(@) Gs 6) = 0- 371, 3712). 

(a) (x,, X2) = (2, 0) is optimal. 


(b) Minimize Z=2Z,+ Z, 
subject to 2x, Fur = yy + zi = 8 
2x, + uy, — y +z =4 
xt Xs +V =2 


x =0, x =0, wu =0, y, 20, y» z0, v 20, 420, B=. 


(© (%1, Xz Uys Vis Yar Vo Zo Za) = (2, 0, 4, 0, 0, 0, O, 0) is optimal. 
(b) Maximize Z = 2.625x,, — 17.625x,. + 4.5%, — 4.5x29, 
subject to Xa + X + 3xq, + 3Bxy = 8 
5x1, + 5x2 + 2x1 + 2x = 14 
and O<x,51.5, fori = 1,2 andj = 1,2. 


(a) (%), %2) = G, 5). 


1/3 1/2 
@ x) = E + G) 3 + (:) l maximizes P(x; r), so that (x,, x.) = (3, 3) is 


optimal. 




Answers to 5, 
Selected Problems 10. 


20. 


12. 
21. 


23. 


27. 


29. 
36. 


41. 


13. 
19. 


22. 





Chapter 15 


(a) All states belong to the same recurrent class. 

(b) m = 7, = ™ = 7, = ™| =F. 

(a) mo = 0.182, m, = 0.285, m, = 0.368, m, = 0.165. 
(b) 31.42. 


Chapter 16 


Input source: population having hair; customers: customers needing haircuts;
queue: customers waiting for a barber; queue discipline: first-come-first-served;
service mechanism: barber(s).





(a) 0.135. 

(b) 0.270. 

(c) 0.0527. 

(b) Po = $, P, = OG". 

(c) L = §, a= 3 W= ds, W, = oo. 

(a) 0.429, 

(b) 0.154. 

(c) 0.072. 

E (Not Running) 

0.718 
2.000 
1.015 
0.553 


(a) W, (exponential) = 2W, (constant) = $W, (Erlang). 

(b) W, (new) = 3W, (old) and L, (new) = L, (old) for all distributions. 
Current policy: L = 1; proposed policy: L = 4. 

(a) W=3. 

(b) W, = 0.20, W, = 0.35, Wy = 1.10. 

(c) W, = 0.125, W, = 0.3125, W; = 1.250. 


0.561 0.316 0.123 
0.571 0.286 0.143 


Chapter 17 












Service Distribution 


Erlang 
Exponential 


(a) E(WC) = 16.

(c) E(WC) = 263. 

Status quo: E(TC) = 50; proposal: E(TC) = 75.75; keep status quo. 
(a) Crew size = 2. 

(b) Crew size = 3. 

u = 1.15 minimizes E(TC). 


(a) E(T) = oe 
U 
67r 

(c) ET) = av 


One doctor: E(TC) = 624.80; two doctors: 92.95; have two doctors. 


15. 
18. 
22. 
24. 
28. 


31. 
34. 
39. 


42. 


Chapter 18 


(a) t = 1.83, Q = 54.77. 

(b) t = 1.91, Q = 57.45, S = 52.22. 

t = 3.26, Q = 26,046, S = 24,572. 

Produce 7 units in period 1 and 7 units in period 3. 
Produce 3 units in period 1 and 4 units in period 3.
Produce 1,857 loaves. 

(s, S) = C1, 5). 

(@) GO) = Gay + Me5 — HB. 

(b) (k, Q) = (21, 100) policy. 

If x = 46, order 46 — x units; otherwise, do not order. 
If x = 2, order 2 — x units; otherwise, do not order. 
If x = y°, order y? — x units; otherwise, do not order. 
y = u — cll — @)/2. 

s = 24, Q = 58, (s, S) = (24, 82). 


Chapter 19 


2,090.5. 

783.2. 

(a) 10/80 7.167 

1/81 7.233 1/82 7.267 1/83 9.633 1/84 9.267 
4/81 7.733 4/82 8.067 4/83 10.367 4/84 8.867 
7/81 7.433 7/82 8.700 7/83 10.433 7/84 8.267 
10/81 7.500 10/82 9.467 10/83 10.267 10/84 7.967 

(b) 1.184. 

(a) 410.333 + 17.630r. 


(b) 604.267. 

(c) 3 431.6 6 445.220 9 468.880 
4 434.84 7 452.098 10 478.992 
5 439.356 8 460.089 11 490.193 

(d) 3 465.52 6 517.405 9 567.864 
4 483.075 7 534.236 10 585.671 
5 500.196 8 551.212 11 604.329 


Chapter 20 


(b) Use slow service when no customers or one customer is present and fast service 
when two customers are present. 


Minimize 3yo + 9yo2 + 3y + DV + 2By2, + 34yy, 
subject to Yor + You ~ Gyor + Yu + 2yo + $Y) = 0 
Yi + Yio — @yor + By + Šya + 2yo + 2y + Fn) = 0 
Ya + Yoo — (yn + Sya + yn + y2) = 0 
Yor + Yoo t Yu + yn + Yor + Yoo =l 
and Vn =O fori = 0, 1,2 and k = 1, 2. 


State 1: attempt ace; state 2: attempt lob.
22.
23. 


24. 
31. 


32. 


33. 
41. 
48. 


a 


12. 


Minimize — yo, + zayo + byn + Wio 
subject to Yor + Yoo — Yor + Y + Yoz + Ji) = 0 
0 


il 


Yu + Y ~ Gyo + 8Yo2) 
Yo + Yoz t Yn t Ya = l 
and Yp = 0 fori = 0, 1 and k = 1, 2. 
Reject $600 offer, accept any of the other two. 
Minimize 60(yo9, + Yı + Y21) — 600yo2 — 800yı2 — 1,000y,,, 


coat 


subject to Yo. + Yoo — (0.95)(8)o1 + Yur + Yo) = 
Yu + ya Z OIDO + Yu + Yop = 


Yar + Yn T (0.95)(3)(¥o1 + yu + Yo) = 
and Yx20O fori =0,1,2andk = 1,2. 


le 


j= 


After three iterations, approximation is, in fact, the optimal policy given in Prob. 22. 
Use fertilizer B regardless of crop quality. 


Minimize —5,400(y9, + yı) ~ 6,200(¥o2 + ¥,2), 


subject to Yor + yo — @)B¥o1 + y1 + B¥02 + $Y) = 2 


Nie 


Ya + Yi — Ayo + 39% + Eyo + $Y) = 
and yn, 20 fori = 0, landk = 1, 2. 


After three iterations, approximation is the optimal policy given in Prob. 31. 

Use fertilizer B in all four periods regardless of crop quality. 

In periods 1 to 3, do nothing when the machine is in state 0 or 1; overhaul when 
machine is in state 2; and replace when machine is in state 3. In period 4, do nothing 
when machine is in state 0, 1, or 2; and replace when in state 3. 


Chapter 21 


Paths are {x_1, x_2} and {x_1, x_3}.

φ(x_1, x_2, x_3) = max(x_1 x_2, x_1 x_3) = x_1 max(x_2, x_3) = x_1[1 − (1 − x_2)(1 − x_3)].

R(p_1, p_2, p_3) = p_1[1 − (1 − p_2)(1 − p_3)].
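
The equality of the two forms of this structure function, and the resulting reliability, can be confirmed by brute force over all 0/1 component states. This is a minimal sketch under the assumption that components fail independently; the function names `phi` and `reliability` are hypothetical.

```python
from itertools import product

def phi(x1: int, x2: int, x3: int) -> int:
    """Structure function: component 1 in series with (2 in parallel with 3)."""
    return max(x1 * x2, x1 * x3)

def reliability(p1: float, p2: float, p3: float) -> float:
    """System reliability, assuming independent components."""
    return p1 * (1 - (1 - p2) * (1 - p3))

# The closed form x1 * [1 - (1 - x2)(1 - x3)] matches phi on every 0/1 state.
forms_agree = all(
    phi(x1, x2, x3) == x1 * (1 - (1 - x2) * (1 - x3))
    for x1, x2, x3 in product((0, 1), repeat=3)
)
example_reliability = reliability(0.9, 0.9, 0.9)
```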

(a) Minimal paths are {x_1, x_3} and {x_2, x_4}.

Minimal cuts are {x_1, x_2}, {x_1, x_4}, {x_2, x_3}, and {x_3, x_4}.

(b) R(p_1, p_2, p_3, p_4) = 1 − (1 − p_1 p_3)(1 − p_2 p_4) = 0.9639 when p_i = 0.90.

(c) Upper bound = exact system reliability.
Lower bound = (1 − q_1 q_2)(1 − q_1 q_4)(1 − q_2 q_3)(1 − q_3 q_4) = 0.9606 when
p_i = 0.90, q_i = 0.10.
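
The exact reliability and the two bounds quoted in this answer can be reproduced from the minimal paths and cuts alone. The sketch below assumes independent components and a four-component system whose minimal paths are {1, 3} and {2, 4}; the names `paths`, `cuts`, `exact_reliability`, `upper_bound`, and `lower_bound` are hypothetical.

```python
from itertools import product
from math import prod

paths = [(1, 3), (2, 4)]                 # minimal paths
cuts = [(1, 2), (1, 4), (2, 3), (3, 4)]  # minimal cuts
components = (1, 2, 3, 4)

def exact_reliability(p):
    """Sum the probabilities of all states in which some minimal path works."""
    total = 0.0
    for state in product((0, 1), repeat=len(components)):
        x = dict(zip(components, state))
        if any(all(x[i] for i in path) for path in paths):
            total += prod(p[i] if x[i] else 1 - p[i] for i in components)
    return total

def upper_bound(p):
    """1 minus the product over minimal paths of (1 - product of p_i)."""
    return 1 - prod(1 - prod(p[i] for i in path) for path in paths)

def lower_bound(p):
    """Product over minimal cuts of (1 - product of q_i), with q_i = 1 - p_i."""
    return prod(1 - prod(1 - p[i] for i in cut) for cut in cuts)

p = {i: 0.90 for i in components}
exact, upper, lower = exact_reliability(p), upper_bound(p), lower_bound(p)
```

Because the two minimal paths share no components here, the path-based upper bound coincides with the exact value, as part (c) states.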

(a) 0.659 ≤ R(t) ≤ 1

(b) 0 ≤ R(t) ≤ 0.324.


Chapter 22 


(a) az. 

(b) Up to $230,000. 

(c) az. 

(a) Guess coin 1. 

(b) Heads: coin 2; tails: coin 1. 


16. 


= 


il. 


18. 


21. 


(a) Produce the chip.
(b) Up to $400,000.
(c) −$2,300,000. It doesn't pay to use market research.
(d) Produce the chip.

Bayes’ procedure without seismic soundings is a, with expected loss of ~ $68,000. 







E(Loss) 


— 137,700 
— 67,925 
— 33,000 

— 33,000 


Seismic Sounding | Bayes’ Action 











0.360 
0.260 
0.210 






ay 
ay 


AUN 





Value of seismic soundings is $781, so they should not be used. 


Chapter 23 


(a) 5, 8, 1, 4, 7, 0, 3, 6, 9, 2. 
(a) Assigning numbers 0, 1, 2, 3, 4 to heads and 5, 6, 7, 8, 9 to tails gives the 
sequence THHTT. 
(a) $x = \sqrt{r}$.
(a) $x = -4 \ln(1 - r)$.
(b) $x = -2 \ln[(1 - r_1)(1 - r_2)]$.
(c) $x = 4 \sum_{i=1}^{6} r_i - 8$.
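
Each of the formulas in these Chapter 23 answers turns uniform random numbers r on [0, 1) into a random observation. The sketch below simply codes them as written; the function names are hypothetical, and the comments on which distribution each formula targets are standard facts about the formulas themselves rather than statements taken from the problem text.

```python
import math
import random

def sqrt_observation(r: float) -> float:
    """x = sqrt(r): inverse transform for the density 2x on [0, 1]."""
    return math.sqrt(r)

def exponential_observation(r: float) -> float:
    """x = -4 ln(1 - r): inverse transform for an exponential with mean 4."""
    return -4.0 * math.log(1.0 - r)

def erlang_observation(r1: float, r2: float) -> float:
    """x = -2 ln[(1 - r1)(1 - r2)]: sum of two exponentials, each with mean 2."""
    return -2.0 * math.log((1.0 - r1) * (1.0 - r2))

def approx_normal_observation(rs) -> float:
    """x = 4 * (sum of six uniforms) - 8: approximately normal with mean 4."""
    return 4.0 * sum(rs) - 8.0

rng = random.Random(0)  # fixed seed so the run is repeatable
sample = [exponential_observation(rng.random()) for _ in range(5)]
```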


Use first 10 three-digit decimals from Table 23.4 and generate observations from 


Analytic | Monte Carlo 
œ 4.3969 


(a) Est{W,} = 23 and P{1.572 = W, = 3.094} = 0.90. 









Stratified Sampling | Complementary Numbers
8.7661 | 3.812
M/M/s model, 621-623, 657

Finite queue, 595, 618-620

Finite queue variation of M/M/s model, 
Hyperexponential distribution, 633-635


Hyperplane, 113 


Identity matrix, 125, 909 

IFORS, 9 

IFR (increasing failure rate), 821 

IFRA (increasing failure-rate average), 823 
Integer nonlinear programming, 457,
489 
Integer programming (IP), 42, 52, 213, 
240-242, 457-489 
Interior-point algorithm for linear programming, an, 100-102, 104, 312-323
centering scheme for, 316-318 
comparison with simplex method, 
101-102, 323 
computational complexity of, 101 
computer implementation of, 100-101 
concepts of, 313 
equality constraints in, 322 
gradient, relevance of, 313-314 
historical background of, 100-101, 313 
Minimax criterion, 439, 442
Minimizing the maximum, 279
Minimum cost flow problem, 250, 334, 
351-369, 382, 467 
algorithm for, 359-369 


Minimum cost flow problem (cont.): 
feasible solutions property for, 353 
formulation of, 352-354 
integer solutions property for, 353, 354, 

467 
special cases of, 334, 351-352, 
n-step transition probabilities, 567
Nearly optimal solution, 479 
(See also Suboptimal solution) 
Net present value, 458 
Network, 336-339, 347 
Residuals, 753, 755
Restricted entry rule, 526, 527 
Retrospective test, 23 
Revenue, 689, 690, 707-708 
equality constraints in, 80-84

foundations of, 112-123 

geometric interpretation of, 59-61, 
112-123 

greater-than-or-equal-to constraints in, 
84 

initialization step for, 64-65, 69, 72-73, 
79-93 

iterative step for, 65-71, 73-76 

matrix form for (see Revised simplex 
method) 

minimization with, 85-86

minimum ratio test in, 73 

modified, for quadratic programming, 
526-528 


Simplex method (cont.): 
multiple optimal solutions in, 78-79, 
118 
negative right-hand sides in, 84-85 
network (see Network simplex method) 
with no feasible solutions, 91 
with no leaving basic variable, 77-78
optimality test for, 64, 68-71, 73, 76 
post-optimality analysis with, 94-100, 
138-139, 169-191 
proper form from Gaussian elimination 
in, 65, 68, 81, 86 
summary, 69-70, 72-73 
in tabular form, 71-76 
tie breaking in, 76-79 
tie for entering basic variable in, 76 
tie for leaving basic variable in, 77 
transportation (see Transportation simplex method)
with unbounded Z, 77-78 
with variables allowed to be negative, 
92-93 
bound on negative values allowed, 92 
no bound on negative values allowed, 
92-93 
Simplex tableau, 71-72 
SIMSCRIPT, 872-873 
Simulation, 645, 856-888 
applications of, 863-865 
clock, 858, 860-863 
cycles in, 880-886 
event generation mechanism for, 858, 
860 
experimental design for, 874-880 
length of cycle in, 881-886 
programming languages for, 872-873 
random number generation for, 
866-868 
random observation generation for, 
869-872 
regenerative method of statistical analysis for, 880-888 
stabilization period for, 879 
starting conditions for, 879-880, 888 
state transition mechanism for, 858, 860 
tactical problems in, 879-880, 888 
time advance mechanism for, 861-863 
Simulation clock, 858, 860-863 
Simulation model, 857-858, 860-873 
validation of, 873 
Simulation program, 872-873 
Simulation programming languages, 
872-873 
Simultaneous linear equations, 913-915 


Singular matrix, 911 

Sink, 339 

Sink node, 339 

Size of cycle (see Length of cycle) 

Slack for an activity, 373 

Slack for an event, 373 

Slack variable, 61-62, 124-125 

Social cost, 661 

Social responsibilities, 18 

Social service system, 600, 661, 664 

Software packages, 20-21, 98-99, 101, 
103-104, 333-334, 488, 543 

Solution, 36-37 

Solution tree, 470, 477, 485 

Source, 212, 339 

Source node, 339 

Spanning tree, 338, 341-342, 354 

feasible, 361-362 

Spanning tree solution, 361 

Special structure, 208, 212, 243, 248-250, 
467, 488 

Special-purpose algorithm (see Special 
structure) 

Stable solution, 440 

Stagecoach problem, 394-398 

Stages, 394, 399 

State-dependent arrival rate, 624-627 

State-dependent service rate, 623-628 

State of the system, 598, 857, 858, 860, 
881 

State transition mechanism, 858, 860 

States, 394, 399, 403, 562, 858 

States of nature, 829 

Stationary distribution, 598 

Stationary probabilities, 574 

Stationary transition probabilities, 563, 581 

Steady state, 574 

Steady-state condition, 598-599 

Steady-state probabilities, 574 

Stochastic process, 562-563 

Stochastic system, 857, 860 

Stopping rule of algorithm, 59 

Storage cost, 689 

Strata, 877 

Strategy, 435 

Stratified sampling, 876-878 

Streamlined procedure for preemptive goal 
programming, 275, 277 

Strictly concave function, 897-899 

Strictly convex function, 897-899 

Strong duality property, 156-158, 447, 
523 

Structure function, 811 

Student’s t distribution, 917 


Submatrices, 909 
Suboptimal basic solution, 164-165, 305 
Suboptimal solution, 18, 22, 479 
Suboptimization, 18, 22, 479 
Subproblem, 246, 469, 473, 477-478, 
480-482 
Successive approximations, 788-792 
SUMT (Sequential unconstrained minimization technique), 540-543 
Superoptimal basic solution, 164-165, 
176-177, 305 
Supply node, 339, 346, 352 
Surplus variable, 85 
Symmetry property, 157, 166-169 
System: 
coherent, 813 
k out of n, 812-813 
monotone, 813 
parallel, 812 
reliability of, 813-815 
series, 811-812 


t distribution, 917 

Table look-up approach, 870 

Table of constraint coefficients, 208-209, 
211, 212, 246, 249 

Tabular algorithm for posterior distribution, 
838 


Testing the model and the solution, 22-23 
Time advance mechanism, 861-863 
Time-cost curve, 376-377, 380 
Time-series, 743-755 
TIMS (The Institute of Management 
Sciences), 9 
Tractability, 20 
Tradeoffs, 20, 98, 287, 290, 376-381 
Transient condition, 598-599, 607 
Transient state, 569 
Transition intensity, 583 
Transition probabilities, 563-566 
n-step, 567 
one-step, 563 
Transportation network, 342, 346 
Transportation problem, 208-234, 
240-244, 250, 290, 334, 351-352, 
354-355, 382, 467, 501-502 
cost and requirements table for, 212 
demand in, 212 
destinations in, 212 
dummy destination in, 214, 216 
dummy source in, 214, 218 
feasible solutions property for, 213-214 




Transportation problem (cont.): 
integer solutions property for, 213, 
241-242, 467 
model of, 212-214 
sources in, 212 
supply in, 212 
volume discounts on shipping costs in, 
501-502 
Transportation service system, 600, 661, 
664 
Transportation simplex method, 219-234, 
239, 242-243, 290, 358, 359, 365 
degeneracy in, 233-235 
donor cell in, 231 
initialization step in, 222-228, 232 
minimum cost criterion for, 256 
northwest corner rule for, 224, 
227-228 
Russell’s approximation method for, 
225, 227-228 
Vogel’s approximation method for, 
225, 226, 228 
iterative step in, 230-235 
optimality test in, 229-230, 233 
recipient cell in, 231 
summary, 232-233 
Transportation simplex tableau, 221 
Transpose of a matrix, 908 
Transshipment(s), 234 
Transshipment node, 339, 346, 352 
Transshipment points, 239 
Transshipment problem, 209, 234-239, 
250, 334, 352, 355, 358, 382, 467 
Travel-time cost, 671-672 
Travel-time models, 656, 672-678 
Tree, 338, 341-343, 470, 477, 485 
enumeration, 470 
minimum spanning, 341-343 
solution, 470, 477, 485 
spanning, 338, 341-342, 354 
Triangular distribution, 871-872 
Trigger point, 726 
Turbo-Simplex, 103 
Two-person zero-sum game, 434 
Two-phase method, 88-90, 104, 526 


Unconstrained optimization, 507, 511-519, 
540-543, 902-904 
Undirected arc, 336 


Undirected cycle, 337-338 

Undirected network, 336 

Undirected path, 337 

Uniform distribution, 720, 728, 729, 861, 
867, 870, 881 

Uniform random number, 861, 863, 867 

Unlimited input source, 595 

Unsatisfied demand, 689 

Unstable solution, 440 

Upper bound constraint, 248-250, 302-304 

Upper bound technique, 301-304, 323, 
359-360, 366-368, 532, 534 

Utility, 435-436, 503, 830, 843 

Utilization factor, 598 


Value determination, 780, 785 
Value of experimentation, 839-840 
Value of the game, 438, 439, 442, 831 
Variable metric methods, 539 
Variables allowed to be negative, 36, 
92-93, 268-270 
Variance-reducing techniques, 875-879 
Vector of basic variables, 125 
Vectors, 909-911 
linearly dependent, 910 
linearly independent, 910-911 
Vertices, of network, 336 
VINO, 103 
Vogel’s approximation method, 225, 226, 
228 


Waiting cost, 659-667 
Waiting-cost functions, 663-667 
g(N) form of, 663-664 
h(W) form of, 664-667 
Water-resource model, 792-796 
Water resources, distribution of, 217-219 
Weak duality property, 156-158 
What’s Best?, 103 
Wyndor Glass Co. problem, 30-31 


XA, 103 
XPRESS-LP, 103 


Yes-or-no decision, 458 
YKTLP, 102 


Instructions for Using 
OR COURSEWARE 


IBM Compatible Computers 


1. Insert diskette in drive A. 
2. Make drive A the current drive. 
3. Type the name of the program (for example, LinProg) and press <ENTER>. 
4. Programs may be copied to and run from a hard disk. 


Macintosh Computers 


1. Insert the diskette, and double click on the program icon (for example, LinProg). 
2. If the computer you are using does not have arrow keys, use the following substitutes: 

       UP     { 
       LEFT   [ 
       RIGHT  ] 
       DOWN   } 

3. Programs may be copied to and run from a hard disk. 
4. If a system folder is needed, one is included on the ProbMod diskette. The system folder will fit with LinProg or MathProg on separate diskettes. 